THE SKILLS OF CLINICIANS IN ANALYSIS OF PROJECTIVE TESTS
JAMES QUINTAR HOLSOPPLE AND JOSEPH G. PHELAN 
Veterans Administration, Washington, D. C. Stevens Institute of Technology 


INTRODUCTION 


With the recent sudden increase in the number of psychologists engaged in clin- 
ical practice, there has arisen an interest in the selection of persons to be trained for 
this work and to that end an interest in the development of reliable measures of 
diagnostic skill. Challman and Kelly report on attempts to determine who
may best be trained to formulate diagnoses. The University of Michigan group
was interested in the possibility of ‘demonstrating the relative and cumulative 
effectiveness of all available techniques as predictive of future professional success’ 
in this highly individualized specialty. In addition to paper and pencil tests of in- 
telligence, achievement, personality, attitudes and temperament, and interviews, 
this group used individual and pooled ratings based on situational and real life pro- 
cedures designed to reveal “aptitude for effective interpersonal relationships”. 
Candidates for clinical training were asked to perform in a variety of situational 
tests and were rated by staff members. Staff members first made independent judg- 
ments which were later pooled in staff team predictions. Predictions, in the form of 
ratings, were made on 15 tentatively defined criterion attributes, (e.g. skill in psy- 
chometry, skill in diagnostic interviewing, etc.), and on 23 tentatively defined per- 
sonality traits, deemed to be predictive of one or more dimensions of successful 
clinical performance. This research showed that the conference method fails to
enable judges to predict who may best be trained as diagnosticians, as do unstructured
interviews, objective and projective tests and clinician-technique combinations of
various sorts.


Estes found, when psychiatric social workers were asked to make judgments
about personality characteristics, that some social workers could demonstrate much
greater ability to make use of cues than others. Existence or degree of this skill
seemed not related to age, length of service, sibling status or to whether or not the
judge had been analyzed. Horn, working with Murray, used an analysis of
variance technique to compare the relative importance for successful diagnosis of
the ability of the judge and the aspect of personality measured. His judges differed
greatly in ability even though similarly trained. Horn felt that, in presentation of
data and in teaching of the clinical process, it is important to single out or emphasize
those aspects of personality which make for difficulty in diagnosis and to recognize
the relative importance and weight to be attached to various kinds of clinical evidence.


The studies reviewed here, which deal with the process of diagnosis, stress the
finding that among clinicians with roughly equivalent training, some seem able to
perform at a demonstrably higher level of efficiency than others similarly trained
and experienced. It would be desirable to try to isolate those clinicians who possess
the skill in order to study the individuals and the process, and ultimately to devise
a method of selecting those who may best be trained.


Such a plan for isolating this diagnostic skill might borrow from the operational 
methods used in other disciplines, using a matching technique requiring the identifi- 
cation of a variety of data as belonging to one and the same individual. This seems 
preferable to the method used by Horn and Estes, where the data were identified
as belonging to the unique individual by requiring that the judges assign trait
names or rate observed behavior on a list of traits and express their judgments in a
series of ratings of abstract personality variables, on the meaning and value of
which there could not be general agreement.

PROBLEM


It is proposed to develop a practical test to isolate the excellent judges of per- 
sonality patterns, and to discriminate the cues upon which their judgments are based. 
If such a test can clearly distinguish those who have such ability, it might be adapted 
for use to select those who have such ability at an early stage in their development. 
A test requiring that clinicians match projective protocols and identify them as 
products of the unique individual might throw some light on the mechanics of the 
diagnostic skill. The clinician could demonstrate ability by the frequency and con- 
sistency with which he recognizes productions and by the reasons he gives for choices 
and classifications. This procedure might contribute knowledge of the diagnostic 
process, and perhaps convert “private correlations” presently in use, informal and
unverbalized, into mathematical statements into which the values of the individual 
on each of the factors could be fitted. After matching, clinicians would be required 
to indicate subjective certainty in matching. It would be possible in this case to 
compare accuracy with subjective certainty, and then to compare our findings with 
those of the earlier investigators and re-evaluate the relationships between sub- 
jective certainty and success in matching projective materials. 


The aims of this experiment are as follows: 
a. To devise a “test” which would enable the individual clinician to demonstrate
skill in utilizing projective techniques.
b. To indicate statistical significance of the matching performance of the
judges as a group.
c. To study the relationship between expressed subjective certainty and
accuracy of judgment for experienced clinicians.


THE MATCHING METHOD


Since the core of this problem lies in the determination of a practical technique 
for estimating presence and degree of diagnostic skill, the test performance should 
involve identification of a global array of records as having been produced by the 
same person by some type of matching technique. A survey of the literature shows 
that matching techniques have been widely used for the estimation of the validity of 
projective materials. Cronbach asserts the primacy of this method in a discussion
of statistical studies of Rorschach validity: 

“A Rorschach record is interpreted qualitatively and in a complex manner when the test
is given in a clinic. A favorite [method] for evaluating Rorschach results is matching . . . which
permits a study of the case as a whole. When a set of Rorschach records, interpreted or not,
and another set of data are available, one may request the judges to match the two sets in pairs
. . . Because of the peculiar character of clinical tests and the limitation of the conventional and
mathematically sound procedures, and because statistical methods for such tests have not been
fully developed . . . matching procedures in which a clinical synthesis of each Rorschach record
is compared with a criterion are especially appropriate . . . . A portrait based on the Rorschach
may be nearly right, yet be mismatched because of minor false elements.”


Cronbach emphasizes that the limitations of the matching method are not statistical 
in nature but lie in the human limitations of the judges. 

Horn presented experienced clinicians with a variety of biographical and
projective materials serially. Judges were asked to check a list of personality trait
names which seemed applicable for various subjects. His judges found that the
biographical data contributed more to an understanding of the person, and enabled them
to predict behavior with greater accuracy, than did other instruments.

Troop’s study required that judges match two Rorschach records for each
person. Of a possible 120 matches, 114 were correct. Judges considered 5 pairs
of records at a time; the contingency coefficient was 0.88. A coefficient of 0.40 was
obtained when judges attempted to match the records of each case.

Krugman validated Rorschach methods by proving that different evaluations
of the same Rorschach protocol could be matched; that interpretations could be
matched to the raw record and to criteria based on a case study. She found no
differences in ability to match among her judges. In her work seven judges matched
several series of Rorschach records with Rorschach interpretations. Twenty-five Rorschach
personality interpretations were matched with clinical case study abstracts in
matching groups of 5 pairs. Krugman’s results, reported in terms of Vernon’s
contingency coefficient values, indicate high validity in all matching experiments.

Vernon reported little success after extensive investigation of estimation of
personality characteristics by the matching method. ‘We are hardly justified in 
talking of a general matching ability’, he says. In his work, average intercorrela- 
tions between eight tests was only 0.085, giving a corrected consistence of —0.43. 
Even within a single type of material, as with 20 sets of photographs and vocations, 
or 35 sets of drawings of houses and men, the corrected reliability of the judges’ rating
was only 0.055 and —0.57. 
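The step from an average intercorrelation to a “corrected” consistency can be illustrated with a short sketch. It assumes, which the text does not state, that the correction is the Spearman-Brown step-up formula applied to the eight tests; the function name is hypothetical. Under that assumption the magnitude 0.43 quoted for the eight tests is reproduced. The formula cannot turn a positive average correlation into a negative value, so the sign printed before 0.43 is not explained by this reading.

```python
def spearman_brown(mean_r: float, k: int) -> float:
    """Spearman-Brown step-up: reliability of a composite of k parts
    whose average intercorrelation is mean_r.
    (Assumed correction; the source does not name the formula used.)"""
    return k * mean_r / (1 + (k - 1) * mean_r)


# Average intercorrelation of 0.085 between eight tests, as quoted above.
corrected = spearman_brown(0.085, 8)
print(round(corrected, 2))
```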

Valentine found intuitive judgments of personality unreliable. He required
that judges (men and women students preparing to be teachers) rate selected personality
traits of children and youths. Matching scores of judges, based on the criterion
of ratings of trait-names, are no greater than chance expectation. Valentine also
investigated the relationship between the degree of accuracy which his student-teacher
judges expressed in their intuitive judgments when matching for personality
characteristics and the accuracy of their judgments. Those judgments “which the
student marked as having been given with special confidence proved often more
inaccurate than other judgments.” Polansky obtained a correlation of 0.008 between
the accuracy of predictions derived from case histories and the indications of
judges as to “how well they thought they knew” the subjects of the histories. Wallin,
in discussing such findings, concluded that some judges tend to “project”, to
mis-read the attitudes or motivations of others because they naively inject their
personal feelings into observations of other people’s behavior. This kind of judge,
the projector, is likely to be wrong while absolutely certain he is right.

Vernon advocates the matching method for reliability and validity
studies of psychograms and other measures, pointing out extraneous factors which 
may be controlled in matching experiments if relationships between matched series 
of data are to be valid. 


1. It is conceivable in matching that some of the judgments might reduce themselves to a 
process of elimination rather than genuine judging. We propose to control this by requiring
that our judges match unequal elements. This provides a fairly direct control over the 
difficulty of the experiment. 

2. Extraneous elements in the materials can afford peripheral clues to correct matching. An 
effort should be made to exclude characteristic turns of speech or other accidental clues. 

3. The validity of the matching result is not affected statistically by the number of elements 
matched at a time. In spite of statistical equivalency, the results of experiments with different 
numbers of things matched must differ subjectively. Vernon found that the number of
things to be optimally matched must vary for each type of material depending on the number 
of impressions the average judge can keep clear in his mind. 


THE METHOD


A matching task of 16 documents was presented to 20 trained and experienced 
clinical psychologists, all of whom were Doctors of Philosophy and had at least two 
years of clinical experience. The documents were presented in four arrays with 
unequal matching; in each array the four tests given were representative of the same 
six individuals, as follows: 

Array A. 4 dictated autobiographies culled at random from 6 individual biographies.¹

Array B. 4 complete T.A.T. protocols selected from the T.A.T. protocols of the identical 6.

Array C. 4 complete Rorschach results (with scoring and location charts) and Sentence
Completion responses selected from records of the same group of 6 individuals.²

Array D. 4 sets of responses to standardized tests (Thurstone Primary Mental Abilities Test;
Kuder Preference Record; and Guilford-Zimmerman Temperament Tests) selected
from the same 6.


¹The “Form for Autobiography” prepared by H. A. Murray at The Harvard Psychological Clinic
was used.
²The Sentence Completion Test employed was the Holsopple-Miale Test.

Subjects differed widely in diagnosis, age, and socio-cultural background.
Vernon stresses the effect of homogeneity on the validity of matching and suggests
the use of as wide a diversity of materials as is practical. The subjects were:

(A) 34 year old male, convict, college graduate, convicted on several charges of molesting young
girls; diagnosed by prison psychiatrist as psychopath-sexual deviate, though Rorschach and
T.A.T. record showed many signs and indications of schizophrenia.

(B) 23 year old male, mental clinic outpatient, unmarried, with apparently no hetero- or
homosexual experience; diagnosis: anxiety state.

(C) 28 year old married female, research scientist, living successfully in community, no need for
treatment.

(D) 18 year old high-school student, male, under psychiatric treatment, anxiety, homosexual
panic.

(E) 36 year old female, probationer, prostitute; no diagnosis, bland, immature, without
foresight or concern.

(F) 36 year old male, business executive; no diagnosis, apparently not in need of treatment.


All materials were edited with a view to eliminating extrinsic clues which might 
make matching possible without an evaluation of data. Where possible, topical
references, turns of speech or mannerisms were excised from protocols. Unequal 
matching was utilized. This procedure maximized the difficulty of the matching 
task and kept the number of units to be matched within practicable limits. Gaps 
occurred in the matching scheme as illustrated in TABLE 1 which shows the pattern 
of presentation of the arrays of tests A, B, C and D. Each array consisted of four 
documents selected from the materials from the six subjects. 


TABLE 1. THE SCHEME OF PRESENTATION OF THE TASK (IN EACH ARRAY, FOUR UNITS,
REPRESENTATIVE OF THE SIX SUBJECTS)

SUBJECTS (columns): Convict (Male) | Mental O.P.D. Clinic (Male) | Research Scientist (Female) |
High School Student (Male) | Probationer (Female) | Business Executive (Male)

ARRAYS OF TESTS (rows):
A. Autobiography
B. T.A.T.
C. Rorschach-Sentence Completion
D. Objective Battery (PMA, Kuder, STDCR)

[An x marks each of the four subjects represented in an array; the positions of the marks
are not recoverable from this copy.]

The described matching task was presented to the judges in mimeographed 
form. There was no limit to the amount of time they could spend on it or the number 
of times they could re-read it. It was accompanied by the following instructions: 

Appended hereto are 16 documents which are the results of standardized autobiographical
interviews, objective tests and projective techniques administered to six individuals. You are
asked to indicate to which of the six individuals any two or more of these documents could be
attributed, so that the documents which you assign to one person could only have been produced
by one person. Each protocol has identifying call letters. Indicate on the attached form
your matching of two, three or four documents as belonging to one person; indicate whether
you feel quite sure, comparatively sure, or unsure of your choice by checking the appropriate
space. Each of the documents is identified by a pair of letters. This pair is the designation to be
used in describing the matching judgment; for example, in filling out the accompanying form
one might write: Documents AB (Biographical sketch), CD (Rorschach-Sentence Completion),
EF (Thematic Apperception Test) seem to belong together, etc.


The primary aim of this task was to provide a vehicle by means of which judges
could demonstrate skill in identifying and in relating inferred personality characteristics.
It was necessary to devise a matching situation of sufficient difficulty to constitute
a challenge and at the same time one which could yield valid results for experienced
clinical psychologists. Because of the number of variables to be kept in mind
simultaneously, the number of units of material which could be presented at once
had to be limited. Vernon points out that we have no clear idea of the point at
which a given task becomes so difficult that the average judge cannot perform.
“The point varied with the type of materials used and the optimum level of difficulty
would vary with the experience, training and intelligence of the judges participating
in the task.”

The judges could attain a score of 36 points if all tests were correctly matched. 
If Judge A matched the Biography with T.A.T. for the Mental Clinic Patient he 
was credited with 1 point; Biography with Rorschach, 1 point; Biography with Ob- 
jective Tests, 1 point; also 1 point for T.A.T. — Rorschach match; 1 point for T.A.T. 
— Objective Test match and 1 point for Rorschach — Objective Test match: a total 
of 6 points for all correct matches for each of six subjects. The judge received 1 
point, naturally, in every case where he correctly located gaps in the data, as for 
example, when he judged that none of the given T.A.T. protocols went with the 
Biography of the Male Convict. The judges indicated that they had followed the 
suggestions in the instructions. 
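
The arithmetic behind the 36-point maximum can be checked with a brief sketch. It only counts comparisons; it does not attempt to reproduce how credit was assigned for gaps, and the variable names are illustrative rather than taken from the study. The same count restricted to comparisons against Biography gives the ceiling of 18 used for the group analysis.

```python
from itertools import combinations

# The four arrays of material and the six subjects described in the text.
ARRAYS = ["Biography", "T.A.T.", "Rorschach", "Objective Tests"]
N_SUBJECTS = 6

# Every unordered pair of arrays is one scorable comparison per subject.
pairs = list(combinations(ARRAYS, 2))
max_score = len(pairs) * N_SUBJECTS

# Comparisons in which Biography serves as the criterion.
biography_pairs = [pair for pair in pairs if "Biography" in pair]
max_against_biography = len(biography_pairs) * N_SUBJECTS

print(len(pairs), max_score, max_against_biography)
```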


RESULTS 


Performance of Judges. Following Vernon and Chapman on the statistics
of the matching method, it was possible to calculate the frequency with which a given
number of correct matches could be expected to occur when the judge was required
to arrange six arrays of data, with six units in each array, against every other
array. See appendix for formulae and calculations.
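
For orientation, the chance distribution of correct matches in the simpler case of equal matching (six units arranged against six, none withheld) can be enumerated directly. This is a sketch of the general idea only: it is not the appendix formula, and it does not model the unequal matching used in this experiment, where two units in each array are withheld. The function name is hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations


def chance_distribution(n: int) -> dict[int, Fraction]:
    """Exact probability of each number of correct matches when n units
    are paired at random with n others (equal matching only)."""
    counts = Counter(
        sum(1 for position, item in enumerate(perm) if position == item)
        for perm in permutations(range(n))
    )
    total = sum(counts.values())
    return {k: Fraction(v, total) for k, v in sorted(counts.items())}


dist = chance_distribution(6)
expected_correct = sum(k * p for k, p in dist.items())
p_three_or_more = sum(p for k, p in dist.items() if k >= 3)

for k, p in dist.items():
    print(k, float(p))
```

In this equal-matching case the expected number of correct matches by chance is exactly one, and exactly five correct out of six cannot occur, since the sixth pair is then forced.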

The performance of the judges, individually and as a group, is compared with
chance expectancy in Table 2. Seven judges (A, E, H, I, M, N, R) performed better
than chance at the .01 level of confidence. Three other judges (C, F, K) performed at
the .05 level, making ten judges working at a level that could be expected to occur
by chance fewer than 5 times out of one hundred (p < .05). The remaining
ten judges (B, D, G, J, L, O, P, Q, S, T) functioned within chance expectation
(p > .05). Thus, differences between workers with similar training and experience
in working with projective materials exist and are demonstrable. Some analyze and
utilize these materials in an immediately relevant way and consistently. That they
do so is shown in their success in recognition and identification of the individual
person through a variety of expressive instruments.


TABLE 2. SIGNIFICANCE OF MATCHING JUDGEMENTS FOR INDIVIDUAL JUDGES. TOTAL NUMBER OF
JUDGES = 20. 14 JUDGES COMPLETED TASK. TOTAL NUMBER OF POSSIBLE CORRECT JUDGEMENTS
FOR EACH JUDGE = 36, OR 6 IN EACH CATEGORY

NUMBER CORRECT (columns): Biog.-T.A.T. | Biog.-Ror. | Biog.-Obj. | T.A.T.-Ror. | T.A.T.-Obj. |
Ror.-Obj. | Total Correct | Total Incorrect

Summary rows: Significance of group of judges (N = 14); Performance of 6 individual judges (p);
Performance of 9 individual judges (p)

[The entries of this table are not legible in this copy.]

In order to evaluate the performance of the judges as a group for significance,
we need only consider the performance of judges who completed the entire task, and score
the task such that the greatest number of correct arrangements of data can equal 18
(one credit for each document successfully matched against Biography). In
this situation, χ² = 5.26, p < .05. Chance alone is unlikely to account for this
result; the performance of judges as a group exceeds chance expectation.
(See appendix for χ² computation.)
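
The probability attached to the reported value of 5.26 can be checked numerically. The sketch below assumes one degree of freedom, which this passage does not state explicitly; the function name is hypothetical. For one degree of freedom the upper-tail probability has a closed form in terms of the complementary error function, so no statistical library is needed.

```python
import math


def chi_square_p_value_1df(chi_square: float) -> float:
    """Upper-tail probability of a chi-square value with 1 degree of freedom.
    With 1 df the statistic is a squared standard normal deviate, so
    P(X >= x) = erfc(sqrt(x / 2))."""
    return math.erfc(math.sqrt(chi_square / 2.0))


# Value reported for the group of judges (1 df is an assumption here).
p_group = chi_square_p_value_1df(5.26)
print(round(p_group, 3))
```

Under that assumption the value falls between the .05 and .01 levels, in agreement with the statement p < .05.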


Qualitative differences in matching performance with various materials. It is im- 
portant to compare the ways in which judges go about their task, how they work 
with different kinds of materials. Some judges might perceive the inter-relationships 
of material derived from two documents, such as Rorschach and Biography, and at 
the same time seem unaware of implications of other materials such as the T.A.T. 
(cf. Table 3 — Judges C, K, M and L). Similarly, judges may do well with one 
subject with a particular kind of dynamics but be unable to empathize with or ap- 
preciate the implications of the problem of some other individual. Other judges may 
be able to perform significantly above chance with every kind of material and all 
subjects, (cf. Judge H). 

Table 3 brings out some of the results of the exceptionally good and the poor
judges in their handling of the various projective documents and the biographical
data. Good judges exceed poor judges in a two to one ratio in comparing Rorschach
with T.A.T. protocol; the ratio closely approaches two to one when they match
Biography and T.A.T., and is about three to two in matching Biography and
Rorschach. Both good and poor judges do much better when matching all projective
materials with Biography than when matching Rorschach with T.A.T. or objective
and projective tests.


TABLE 3. RESULTS OF MATCHING TASK. PERFORMANCE OF BEST AND POOREST MATCHERS IN
COMPARISON OF BIOGRAPHICAL DATA AND PROJECTIVE MATERIALS (NUMBER OF JUDGMENTS EACH
CATEGORY = 6)

Left panel: Clinicians who made 11 or more correct choices (p < .05)
Right panel: Clinicians with fewer than 11 correct choices (p > .05)
Columns in each panel: Judge | Biog. vs T.A.T. | Biog. vs Rorschach | T.A.T. vs Rorschach

[The entries of this table are not legible in this copy.]

Certainly an observation which bears further investigation is the possibility
that the Biography covers a wider range of attitudinal material and thus overlaps to
a significant degree both Rorschach and T.A.T., whereas the Rorschach and T.A.T.
are more specific, more limited in range, with less overlap, thus presenting greater
difficulty for matching. Both the Rorschach and T.A.T. are considered by clinicians
to deal with quite distinct “layers” of personality.


Validity of Tests Used. Some evidence concerning the validity of projective 
devices may be elicited from this experiment. It is necessary: 


(1) To consider only the performance of those clinicians (14 judges, judges A - N) who completed
the task.

(2) To note that Vernon’s statistic for matching, the contingency coefficient and its probable
error, cannot be utilized because of our employment of unequal matching series. In our
situation, in which four units in each array are given and two withheld, the two absent units
must be considered interchangeable; therefore Vernon’s formula is inapplicable. The χ² statistic
can be employed; the value which various tests have for clinicians can be measured by
successes in matching compared with chance expectancy.
The performance of judges as a group can be evaluated on the materials of the various orders
by considering the matching of each array against every other array as a separate matching
experiment. The theoretical distribution to be expected when 14 judges arrange one array
of six documents against each other array is given in Table 4 (cf. appendix for calculation
of these values). The χ² then provides a measure of the deviation of our sample from this
hypothetical population ratio, and a means of judging whether or not our measure could be
expected in sampling.


TABLE 4. CALCULATION OF THEORETICAL DISTRIBUTION

Number of pairs          Number of judges to be expected
correctly matched        to achieve such score

0                        3.0632
1                        4.9588
2                        3.7184
3                        1.4588
4                        0.728
More than 4              0.0728

Theoretical number of judges (1 degree of freedom)

Less than 2 correct matchings       8.022
2 or more correct matchings         5.978
Sum                                14.0000
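
The way the two-category expectation feeds a χ² comparison can be sketched as follows. The expected frequencies 8.022 and 5.978 are those of Table 4; the observed split of the 14 judges is hypothetical, chosen only to show the computation, and the names are illustrative.

```python
# Expected numbers of judges in the two categories, taken from Table 4.
EXPECTED = {"less_than_2": 8.022, "two_or_more": 5.978}


def chi_square_statistic(observed: dict, expected: dict) -> float:
    """Sum over categories of (observed - expected)^2 / expected."""
    return sum(
        (observed[key] - expected[key]) ** 2 / expected[key] for key in expected
    )


# The "less than 2" category pools the first two rows of the distribution.
pooled_first_two_rows = 3.0632 + 4.9588

# Hypothetical observed split of 14 judges (not a result from the study).
hypothetical_observed = {"less_than_2": 3, "two_or_more": 11}
statistic = chi_square_statistic(hypothetical_observed, EXPECTED)
print(round(statistic, 2))
```

Because the two categories must sum to 14, only one count is free to vary, which is why the comparison carries one degree of freedom.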








TABLE 5. χ² VALUES OBTAINED FROM COMPARING THE ARRAYS OF 4 UNITS REPRESENTATIVE OF THE
SIX SUBJECTS WITH THE OTHER ARRAYS OF DIFFERENT MATERIALS. THE THEORETICAL EXPECTATION
OF THE NUMBER OF JUDGES CORRECTLY MATCHING LESS THAN 2 IS 8, AND THE EXPECTED NUMBER OF
JUDGES MATCHING 2 OR MORE IS 6.

Matching Comparison                  Level of Confidence

Biography vs. T.A.T.                 p<.10
Biography vs. Rorschach              p<.001
Biography vs. Objective Tests        p<.10
T.A.T. vs. Rorschach                 p<.30
T.A.T. vs. Objective Tests           p<.30
Rorschach vs. Objective Tests        p<.10

[The columns of obtained frequencies (Less than 2, 2 or More) and of χ² values are not
legible in this copy.]

From Table 5, it is seen that the judges were able to match Rorschach results
with Biography with the highest excess of obtained over expected frequencies. The
matchings of Biography vs. T.A.T., Biography vs. Objective Tests, and Rorschach
vs. Objective Tests were next in efficiency. The type of matching in which the judges
were least competent involved T.A.T. vs. Rorschach comparisons and T.A.T. vs.
Objective Tests.

Judges in our experiment were less successful in matching than Krugman’s 
judges. Her judges were presented with a much less challenging task, having fewer 
complex variables to keep in mind simultaneously. 

The consistency of high level performance on the part of some judges in our 
experiment and the wide differences in performance between judges seem to indicate 
that an ability to match materials of this type does exist, can be isolated, demon- 
strated and subjected to further study. 


Methods Used by Judges. The number of judges in the group who were able to
utilize the “series of tests” approach can be observed by inspection. Table 6
indicates the method and extent to which judges are able to recognize the subject by
identifying most or all of his tests as being his as a matter of fact.


TABLE 6. RECORD OF 14 JUDGES (WHO COMPLETED THE TASK) IN CORRECTLY
MATCHING RECORDS PRODUCED BY THE SIX SUBJECTS IN SERIES, WHEN TESTS
WERE MATCHED AGAINST BIOGRAPHY AS A CRITERION.

Columns: Judge | Number of tests correctly matched against biography
Judges listed: A, B, C, D, E, F, G, H, I, J, K, L, M, N

[The entries of this table are not legible in this copy.]


In general, judges who match pairs of tests also excel in matching tests against bio- 
graphy in series. Judges A, C, E, H, I, and M, high scorers in paired matchings, 
seem able to follow the procedure of visualizing the individual, matching all his 
tests in series. Judges were able, not only to identify any two tests as belonging
together, but also to identify tests in series as “going with” a given biography. In this
case judges seem able to recognize the individual through several of his productions, 
rather than merely to match two tests. 


Significance of Types of Errors. Vernon asserts that the most important of all the 
extraneous factors which influence the matching experiment is the homogeneity 
or diversity of the materials. Homogeneity is dependent on the distinctiveness or 
range of unlikeness among the subjects whose modes of expression and reflections 
of personality characteristics made up the data. 

Randomness is usually examined by means of the standard deviation of sub- 
jects’ scores, but in matching, the difference between the subjects in any one set of 
materials is qualitative so that no measure comparable to the standard deviation is 
available. A study can be made of the second choices and the “good” errors (those
made by many judges, where similar aspects of two subjects are more pronounced 
than their differences). A study of these errors affords insight into the ways in 
which test results of seemingly widely different persons can seem similar to the 
analyzing judge. 

TABLE 7. ANALYSIS OF MISMATCHES MADE BY 20 JUDGES

Type of Mismatch                                          Number    Per Cent

Biography of convict matched with T.A.T.                    12         60
of female scientist

Biography of clinic outpatient subject                       8         40
matched with T.A.T. of female probationer

Biography of convict with Rorschach of                       8         40
clinic patient

Rorschach of high school student with T.A.T.
of clinic outpatient

Rorschach of convict with biography of
clinic outpatient

Rorschach of clinic outpatient with biography
of female probationer

[Figures for the last three rows are not legible in this copy.]


Reasons given by the clinicians for mismatches tend to shed some light on
method of analysis:

1. Convict (educated chemical engineer) was confused with female scientist because:
   a. Methodist religion.
   b. T.A.T. expression of dissatisfaction over inability to compete with males,
      sexual inadequacy, being taunted about something small.
   c. Reaction to strict father, rigidity of attitude at home, social pressure.
   d. Extensive vocabulary, reference to schools and education.
   e. Early sex prohibition, concern with sex play, exhibitionism, discussion of
      homosexuality in both records.

2. Reasons for confusion of mental clinic outpatient (male, age 23) with female probationer:
   a. Vocabulary limited, low educational level, vulgarisms, narrow range of interests.
   b. Immaturity, childishness, lack of planning or foresight, irresponsibility, living
      for moment, repressing guilt.
   c. Reference to movies and dancing.
   d. Attitudes typically female.

3. Reasons for confusion of convict with mental clinic outpatient:
   a. Rorschach obviously psychotic, must belong to patient under treatment.
   b. Impulsiveness in Rorschach and in biography.
   c. Avoidance of women, sexual confusion, difficulties with women.
   d. Rorschach mention of golf course, biography hit in eye with golf club.
   e. Obsessive-compulsive.

4. Reasons for confusion of high-school student (male, age 16) with mental clinic
   outpatient (male, age 23, high school education):
   a. Impulsiveness, immaturity.
   b. Naivete, fear and avoidance of women.
   c. Expression of being threatened by strict, unreasonable, overwhelming
      demanding father figure.
   d. Fears of being criticized or disliked by age-mates (males).


In this experiment an attempt was made to select subjects who were heterogeneous 
as to personality characteristics and as to socio-cultural status. A consideration of 
the mismatches indicates that the subjects frequently have characteristics in com- 
mon when the data are closely analyzed that were not immediately apparent on first 
inspection; i.e. similarities in viewpoint, attitude, mood and in dynamics. 

In work with these subjects, judges frequently misidentified because of
similarities between subjects which appeared to outweigh differences:

1. The female probationer and the mental hygiene clinic outpatient have a similar intellectual
level, their cultural level is similar, many of their references are to movies and periodicals
from which they borrow their limited stock of ideas. Both are described as immature,
repressing guilt, impulsive, not looking ahead, living for the moment, non-introspective.

2. The female scientist and the college educated convict are both well read and well spoken.
Both have had scientific training. Expression of their reactions to early religious training is
similar, as is reaction to both parents. Though they are of opposite sex, both make direct
and indirect references to striving toward symbolic masculinity. Sex identification is similar
as is reaction to marriage partner.

3. In biography one subject was identified as an outpatient of a mental hygiene clinic.
Rorschach of the convict was obviously that of a disturbed person though the biography did not
mention a [word illegible]. Many matched the biography which mentioned treatment with the
most disturbed projective test available. To do so was to give these materials only
superficial analysis.


“Good errors” occur when similarities between subjects outweigh differences, so
that the test results for the similar subject are very frequently mistaken for those of
the right case.

Analysis of good errors constitutes a non-statistical method of evaluating randomness.
Such errors are not numerous in this experiment and do not occur so often as
to invalidate it. The original materials contributed by subjects can
be regarded as adequately heterogeneous.


Subjective Certainty of Judgments. The task of reporting degrees of certainty about 
judgments was optional in this experiment, and not all of our judges commented 
about their feelings of certainty. In only 24, or 6% of 378 judgments, did judges feel 
free to state that they were very sure of their judgments. This is in contrast to Valen- 
tine’s results, where his untrained judges declared themselves absolutely sure in 
more than 50% of judgments. In our experiment when judges were very sure, they 
were wrong more often than right (wrong in 75% of cases) but not wrong a signifi- 
cantly greater percentage of time than were the general population of judges (wrong 
in 67% of judgments). 

Experienced judges tended to be very conservative. With the exception of 
Judge G who declared himself very sure in almost every case, they did not show the 
tendency indicated by Valentine and Polansky and discussed by Wallin, to be most 
confident of correctness of judgment when most wrong in judging. The tendency to 
“project”, that is, to misread motivation, to think of others’ actions in terms of
one’s own motivations, did not seem to govern the decisions of the trained judges. 

In 29% of the judgments, clinicians felt “relatively sure” of the correctness of
the judgment; in 13% of cases, they indicated that they were not at all sure. In both
instances, though more often wrong than right, they were right more often (41% of
the time), but not significantly more often than the general population of cases (right
32% of the time). Judges who indicated they were relatively sure or not sure were
explicit but conservative. They were right more often (41% of the time), but not
significantly more often than those who did not feel free to comment or who did not
bother to comment about feelings of subjective certainty (these were right 27% of the
time). Generally, those judges who were most often right described themselves as
relatively sure or as not at all sure of the correctness of their judgments. Table 8
shows the frequencies of judgments concerning which a degree of subjective certainty
was expressed, analysed according to actual correctness or incorrectness.


TABLE 8. NUMBER AND PERCENTAGE OF JUDGMENTS CONCERNING WHICH A DEGREE OF SUBJECTIVE
CERTAINTY WAS EXPRESSED. (NUMBER OF JUDGES 20)

                              Degree of Certainty
Judgments            Relatively Sure     Not Sure     No Comment

Correct      No.            45              21             52
             %              40              42             27.1

Incorrect    No.            67              29            140
             %              59.9            58             72.9

Total        No.           112              50            192
             %              29.4                           51.0
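
The percentages quoted in this passage can be re-derived from the counts in Table 8. The following is a minimal sketch of that arithmetic, not part of the original analysis; the names `table8`, `pct` and `TOTAL_JUDGMENTS` are introduced here for illustration, and the figure of 378 judgments is taken from the text above.

```python
# Counts transcribed from Table 8: (correct, incorrect) per certainty category.
table8 = {
    "relatively sure": (45, 67),
    "not sure": (21, 29),
    "no comment": (52, 140),
}
TOTAL_JUDGMENTS = 378  # stated in the text; includes the 24 "very sure" judgments


def pct(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


for label, (right, wrong) in table8.items():
    n = right + wrong
    print(f"{label:16s} n={n:3d} correct={pct(right, n):5.1f}% "
          f"share of all judgments={pct(n, TOTAL_JUDGMENTS):5.1f}%")

# Pool the two categories in which judges stated a degree of certainty.
pooled_right = table8["relatively sure"][0] + table8["not sure"][0]
pooled_n = sum(table8["relatively sure"]) + sum(table8["not sure"])
pooled_pct = pct(pooled_right, pooled_n)
print(f"relatively sure + not sure: {pooled_pct}% correct")
```

Pooling the "relatively sure" and "not sure" rows reproduces the 41% figure cited in the text, and the "no comment" row reproduces the 27% figure.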


SUMMARY 


This study concerns the clinical psychologist’s ability to arrive
at an accurate diagnosis. In diagnosis, the data obtained from psychological
examinations are interpreted by the examiner in such a way that these data become
meaningful in terms of the particular individual examined. The one fact which all
the data obtained from a single person have in common is the fact of their production
by that individual; hence diagnostic skill can be determined through the identification
of data as having been produced by that individual. The purposes of this
experiment were:


(a) to construct a matching task to enable the clinician to demonstrate
diagnostic skill in a direct fashion,

(b) to evaluate said instrument as a selection device for those to be trained as
clinical psychologists,

(c) to evaluate judges’ matching performance in terms of significance,
individually and as a group,

(d) to shed light on the validity of tests used in the battery,

(e) to study the correspondence of subjective certainty in matching and the
accuracy of judgments.


A matching task of sixteen documents, four autobiographies representative of 
six people, four Thematic Apperception Test protocols representative of the same 
six people, four Rorschach and Sentence Completion protocols and four Objective- 
type batteries (Thurstone Primary Mental Abilities, Kuder Preference Record and 
Guilford STDCR) was presented to twenty clinicians for matching. Judges were 
asked to match all documents in unequal series, to indicate degrees of subjective 
certainty, and to give reasons for matching. 

Judges as a group performed at a higher level than could have been expected by 
chance. Individual judges performed considerably above chance. Performance of 
judges was differential, that is, judges who were superior in matching with one test 
were superior in most. A skill in the analysis of such materials exists. 

The material in the biography, Rorschach and Thematic Apperception Test 
overlaps or corresponds to the extent that it may be matched with higher than 
chance expectancy, therefore these tests can be considered valid. Judges were able 
to identify tests correctly in series as belonging with a given biography. Without 
having been so instructed judges used the biography as the criterion. It was also 
possible to recognize subjects in terms of identifying tests as belonging to distinct 
persons when the biography or some other unit was not given. In expression of 
subjective certainty it was found that judges who were most often correct were un- 
willing to express confidence in their judgments. 

The importance of the ability of the judge in matching materials of this type is
indicated. It can be inferred that there exists a diagnostic skill, and that some judges
with roughly equivalent experience possess it and can demonstrate it in greater
degree. The matching technique seems utilizable to isolate those with such skill and
to select people with demonstrable diagnostic ability from among those entering
the field.
APPENDIX*


I. Method of computing probabilities that a given number of arrangements could
occur by chance when an array (made up of 6 units) is matched against one or more
other similar arrays.

Chance probabilities when one person arranges an array comprising six
units against one other similar array:

(m X t:t = 1 X 6:6)

Formulae:

1) $n(n-1)(n-2)\cdots 1 = n!$: the number of different arrangements of
n things that can occur.

2) $\dfrac{n!}{s!\,(n-s)!} = {}_{n}C_{s} = \dfrac{n(n-1)(n-2)\cdots 1}{s(s-1)(s-2)\cdots 1\,\cdot\,(n-s)(n-s-1)\cdots 1}$:
the number of different ways that it is possible to take s things out of n things.








*Notation: hereafter ‘m’ shall represent the number of arrays or sets of material presented; ‘t’ 
the number of things to be matched in any one array. For example (m X t:t = 4 X 6:6) shall 
mean that judges were required to match 4 arrays of material, each array comprising six units. 
II. Chance expectancy when six things are arranged against another array of six
things:

Number of correct                                            Cumulative
arrangements         Number of ways           Frequency      Frequency

      0              265           = 265        .3680          1.0000
      1              44 X 6C5      = 264        .3665           .6320
      2               9 X 6C4      = 135        .1875
      3               2 X 6C3      =  40        .0555
      4               1 X 6C2      =  15        .0208
      5               0            =   0        .0000
      6               1            =   1        .0014

Total                                720

(For five things, exactly one correct arrangement can occur in 9 X 5C4 = 45 ways.)

2! = 2; 3! = 6; 4! = 24; 5! = 120; 6! = 720.
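
The chance distribution for one array of six units matched against another can be recomputed directly. The sketch below assumes the standard counting argument (choose which k units are correct, then derange the rest); the function names `derangements` and `match_counts` are illustrative and do not come from the original paper.

```python
from math import comb, factorial


def derangements(n):
    """Number of permutations of n items in which no item keeps its place."""
    d = [1, 0]  # D(0) = 1, D(1) = 0
    for k in range(2, n + 1):
        d.append((k - 1) * (d[k - 1] + d[k - 2]))
    return d[n]


def match_counts(t):
    """counts[k] = number of arrangements of t units with exactly k correct."""
    return [comb(t, k) * derangements(t - k) for k in range(t + 1)]


counts = match_counts(6)
total = factorial(6)
freqs = [c / total for c in counts]

for k, (c, f) in enumerate(zip(counts, freqs)):
    print(f"{k} correct: {c:3d} ways, frequency {f:.4f}")
```

The computed frequencies agree with the tabled values to about three decimal places; the small differences in the last digit presumably reflect rounding in the original.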








III. Chance frequencies computed by entering the frequency distribution where 
one array is matched against one other array and against these same frequencies 
(representative of the consideration of a third array): 








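Number of          % Frequency
arrangements       (one array against another)

      6                 .0014
      5                 .0000
      4                 .0208
      3                 .0555
      2                 .1875
      1                 .3665
      0                 .3680

(The columns of this table carry the same headings, number of arrangements and
% frequency; the cell entries are not legible in the source.)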







IV. It is mathematically possible to consider the probabilities that a given number
of successful matchings could occur by chance alone when one array is arranged
against three other sets of data. (In our experiment, judge X is asked to arrange one
set of materials (6 T.A.T.’s representative of 6 persons) and a second set of materials
(6 Rorschachs similarly matched) and a third (6 P.M.A.’s) against a criterion set
(6 biographies).)

Probabilities are arrived at by considering frequencies when two sets of material 
are arranged against a third, and entering the frequencies for another arrangement. 
                                  No. of arrangements (third array)
Number of        %
correct       Frequency      0        1        2        3       4       5       6
matchings                 (.3680)  (.3665)  (.1875)  (.0555) (.0208) (.0000) (.0014)

   12
   11           .0000
   10
    9
   etc.
    4                      .03353   .03339   .01708   etc.
    3                      .06558   .06531   .03342
    2                      .10021   .09798   .05106
    1                      .09925   .09885   .05057
    0                      .04983   .04962   .02539   etc.











V. It is possible to compute probabilities that a given number of correct matchings 
could occur when judge X arranges 6 sets of materials, each set made up of 6 units, 
against criteria: 








Number of                                Cumulative
Correct Matchings      % Frequency       % Frequency

       19                .00002            .00002
       18                .00006            .00008
       17                .00015            .00023
       16                .00039            .00062
       15                .00099            .00161
       14                .00237            .00398
       13                .00539            .00937
       12                .01168            .02105
       11                .02325            .04430
       10                .04192            .08622
        9                .06852            .15474
        8                .10355            .25829
        7                .13738            .39567
        6                .16028            .55595
        5                .16033            .71628
        4                .13264            .84892
        3                .08912            .93804
        2                .04454            .98258
        1                .01484            .99742
        0                .00248            .99990


Median number of matchings which could be expected to occur by chance: 6.387. 
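
The step from one array to several can be sketched as a repeated convolution of the single-array distribution with itself. This assumes, as the Appendix appears to, that chance matchings on different arrays are independent; the helper names `convolve` and `total_distribution` are illustrative only.

```python
def convolve(p, q):
    """Distribution of the sum of two independent counts with distributions p and q."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


# Chance distribution for one array of six units (0 to 6 correct), from Section II.
single = [265 / 720, 264 / 720, 135 / 720, 40 / 720, 15 / 720, 0.0, 1 / 720]


def total_distribution(m):
    """Chance distribution of total correct matchings over m independent arrays."""
    dist = [1.0]
    for _ in range(m):
        dist = convolve(dist, single)
    return dist


six = total_distribution(6)
mean_six = sum(k * p for k, p in enumerate(six))

for k in range(0, 8):
    print(f"{k:2d} correct: {six[k]:.5f}")
print(f"mean number correct by chance: {mean_six:.3f}")
```

Under these assumptions the probabilities for small totals come out close to the corresponding entries at the foot of the table in Section V.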

















VI. In order to test whether the obtained distribution of scores (number of correct 
matchings) could have arisen on the basis of a random sampling from a computed 
theoretical distribution, it is necessary to: 

a) Consider only those 14 judges who completed the task (Judges A-N).

b) Work out the statement of χ² when the greatest number of correct
arrangements can equal 18 (three arrays of data, Rorschach, T.A.T. and
objective set, are considered to be matched against the criterion array,
biographical material).

(m X t:t = 3 X 6:6)

COMPARISON OF THEORETICAL AND OBTAINED MATCHINGS
FOR THE GROUP

                      Scores
Incidence          0-3      4-18

Theoretical        5.11     8.89
Obtained           1.0     13.0

χ² = 5.26; p < .05
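
As a rough check on the comparison above, the usual chi-square statistic can be computed from the four printed cell values. This is a sketch only: it assumes the plain (uncorrected) formula with one degree of freedom, and the critical value 3.841 is the standard tabled .05 point for one degree of freedom rather than a figure from the paper.

```python
# Cell values as printed in the comparison table: scores 0-3 and 4-18.
theoretical = [5.11, 8.89]
obtained = [1.0, 13.0]

# Uncorrected chi-square: sum of (observed - expected)^2 / expected.
chi_square = sum((o - e) ** 2 / e for o, e in zip(obtained, theoretical))

CRITICAL_05_DF1 = 3.841  # standard .05 critical value, 1 degree of freedom
print(round(chi_square, 2), chi_square > CRITICAL_05_DF1)
```

Computed this way the statistic comes to about 5.21, slightly below the printed 5.26 but leading to the same conclusion at the .05 level; the small discrepancy may reflect rounding of the theoretical frequencies.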




















THE DIFFERENTIATION OF WELL AND POORLY INTEGRATED 
CLINICIANS BY THE Q-SORT METHOD¹


JULIUS SEGAL²
The Johns Hopkins University 


PROBLEM 


The problem of describing the successful clinical psychologist is an important 
one for purposes of vocational guidance as well as of trainee selection. The present 
study deals with a phase of this problem, specifically, with the following question: 
Which personality traits, among those studied, distinguish most clearly between 
well and poorly integrated clinicians in both general and vocational terms? Al- 
though, in seeking the answer to this question, no external criterion for clinician- 
performance was utilized, it was foreseen that some evidence as to the internal 
validity of the integration criterion studied here would be provided. 


METHOD 


Measures of integration were obtained by means of a Q-sort test of 100 items
constructed by Butler and Haigh from statements made by clients at the University
of Chicago Counseling Center.³ Butler and Haigh regarded the degree of
congruence between the self concept and the ideal self concept as a measure of
personality integration or adjustment. It is this integration criterion, i.e., the
correlation between the self and ideal self concepts, which is used here. The items
comprising the Q-sorts are presented under Results.

Thirty-six clinical psychologists served as subjects, six clinicians at each of six 
levels of training and experience. These were first, second, third and fourth year 
students in the clinical psychology program at The Catholic University of America, 
a recent Ph.D. group whose degrees were from 2-14 months old, and a Ph.D. group 
whose degrees were received 7-27 years earlier. All subjects were white males,
and the groups were matched as evenly as possible in age, marital status and in- 
telligence. 

Subjects were tested at Catholic University and in the clinics and hospitals in
which they were employed. Each subject was given a package of 3” by 5” cards on
which were typed the items to be rated, together with arbitrarily assigned item
numbers to be used for correlation purposes. A layout sheet, containing 11 boxes
numbered from 0-10, was provided. The subject was told to place each card in one
of the numbered boxes according to whether the statement contained was completely
applicable to him (rank 10), completely inapplicable (rank 0), or intermediate
in applicability along the continuum. The only additional requirement was that
each subject sort the cards so that they constituted a normal distribution; the
clinicians were told, in other words, how many cards were to be placed in each box. The
use of 11 ranks for Q-sorts has become standard since the earlier studies of
Stephenson. The forcing of a normal distribution is desirable as a statistical
convenience since it might be, as noted by Hartley, that on a large sample of traits, items
would fall into a fairly normal distribution even if unforced.

In this study, four sorts, or self-ratings, were obtained from each subject; these 
were the ratings of the self concept (S), the ideal self concept (SI), the self concept as
a clinical psychologist (P), and the ideal self concept as a clinical psychologist (PI). 


¹This study is based upon a portion of the author’s Ph.D. dissertation.

²The author is deeply indebted for the assistance given him by his chief advisor, Dr. Helen E.
——, and by Dr. Claire M. Vernier and Dr. John W. Stafford of The Catholic University of
America.

³The Counseling Center at the University of Chicago kindly supplied the Q-sort items in May
1951 in advance of publication of the work of Butler and Haigh. Dr. Carl R. Rogers, in a personal
communication dated 5 May 1954, informed the author that continuing studies based upon these
items are in progress at the University of Chicago, and that revised lists of items will probably be
evolved in the light of the findings of ongoing research.
RESULTS


Intercorrelations of all four sorts were computed for each subject. For purposes 
of this study, only two correlations are dealt with, those that serve as indices of 
integration. They are: the self vs. the ideal self (S-SI), the criterion for general in- 
tegration, and the self as a clinical psychologist vs. the ideal self as a clinical psycho- 
logist (P-PI), the criterion for vocational integration. 
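
The integration index described here is an ordinary product-moment correlation between two sorts of the same items. The sketch below illustrates the computation on ten made-up item ranks (the study itself used 100 items); the data and the helper name `pearson_r` are hypothetical and are not taken from the paper.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equally long lists of ranks."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical ranks (0-10) given to the same ten items under two instructions.
# Both sorts use the same forced distribution of ranks, as in a Q-sort.
self_sort = [10, 8, 7, 6, 5, 5, 4, 3, 2, 0]
ideal_sort = [8, 10, 6, 7, 5, 4, 5, 2, 3, 0]

s_si = pearson_r(self_sort, ideal_sort)
print(f"S-SI correlation for this hypothetical subject: {s_si:.2f}")
```

A high value indicates close agreement between how the subject sorts the items for the self and for the ideal self, which is the sense of "integration" used in this study.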

In order to select the best and most poorly integrated psychologists in general
terms, the upper and lower 27% of the distribution for the S-SI correlations were
chosen; similar portions of the P-PI correlation distribution were chosen in order
to select the best and most poorly integrated psychologists in vocational terms.
The use of a 27% sample at the extremes of a distribution as a basis for comparison
is suggested by Kelley as the most meaningful sample of distribution limits. In
this case, then, the upper and lower groups in terms of size of Q-sort r’s comprised
10 clinicians each. The range of correlations for each group is as follows: High
S-SI, .81 to .98; Low S-SI, -.25 to .59; High P-PI, .83 to .99; Low P-PI, .18 to .68.⁴
Utilizing Fisher’s z-transformation, mean correlations for these groups were
obtained. They are: High S-SI, .95; Low S-SI, .60; High P-PI, .94; Low P-PI, .69.
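
Two pieces of arithmetic in this paragraph can be sketched briefly: the size of a 27% extreme group drawn from 36 subjects, and the averaging of correlations through Fisher's z-transformation. The correlations in `example_rs` are hypothetical values chosen for illustration, since the individual coefficients are not reported; `mean_correlation` is an illustrative name.

```python
from math import atanh, tanh


def mean_correlation(rs):
    """Average correlations by transforming to Fisher's z, averaging, and transforming back."""
    zs = [atanh(r) for r in rs]
    return tanh(sum(zs) / len(zs))


# Size of each extreme group: 27% of 36 subjects, rounded to a whole person.
group_size = round(0.27 * 36)

# Hypothetical correlations for a high group; not data from the study.
example_rs = [0.81, 0.90, 0.95, 0.98]
mean_r = mean_correlation(example_rs)

print(group_size)
print(f"z-averaged correlation: {mean_r:.2f}")
```

Averaging in the z metric rather than averaging the r's directly gives more weight to coefficients near 1.0, where the r scale is compressed.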

Means were computed for the ranks assigned to the 100 Q-sort items by the 
clinicians in each of the four groups, and the means of the high and low correlation 
groups were compared. A difference in mean rank of one point or more was con- 
sidered as significant since the Q-sort rating scale itself bears intervals between 
ranks of one digit; to study items whose differences in mean ranks were in fractions 
of a unit would be to imply a fineness of within rank meaning which, in fact, does not 
exist. The items which differentiated in this manner between the upper and lower 
correlations groups are noted in the list of Q-sort items presented below. The range 
of differences for these discriminating items is from 1.0 - 2.4; with few exceptions 
these are not significant statistically at the <.05 level. 
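
The item-selection rule described above (a difference of one rank or more between group means) can be sketched as follows. The tiny data set is made up for illustration, with three clinicians per group and three items; `discriminating_items` is a hypothetical helper name.

```python
def discriminating_items(high_group, low_group, threshold=1.0):
    """Return {item number: high-minus-low difference in mean rank} for items
    whose absolute difference reaches the threshold."""
    n_items = len(high_group[0])
    result = {}
    for i in range(n_items):
        high_mean = sum(sort[i] for sort in high_group) / len(high_group)
        low_mean = sum(sort[i] for sort in low_group) / len(low_group)
        diff = high_mean - low_mean
        if abs(diff) >= threshold:
            result[i + 1] = diff  # items are numbered from 1
    return result


# Hypothetical ranks: each inner list is one clinician's ranks for items 1-3.
high = [[8, 5, 2], [9, 5, 3], [7, 6, 2]]
low = [[6, 5, 4], [6, 6, 5], [7, 5, 4]]

selected = discriminating_items(high, low)
for item, diff in selected.items():
    print(f"item {item}: difference in mean rank {diff:+.2f}")
```

A positive difference marks an item ranked as more applicable by the high-correlation group, a negative one an item ranked as more applicable by the low-correlation group.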

The clinicians having the highest S-SI correlations (best integrated in general
terms) differed from the clinicians having the lowest S-SI correlations (most poorly
integrated in general terms) on a total of 34 items for the self concept (S) sort. On
the average, the high group ranked as significantly more applicable to themselves
items describing them as warm in their relationship with others, having self control,
liking people, poised, tolerant, satisfied with themselves, intelligent, good mixers,
having initiative, taking a positive attitude toward themselves, being hard workers,
feeling adequate, emotionally mature,⁵ responsible for their troubles, expressing
their emotions freely, being decisive, rational, dominant, and assertive. In the reverse
direction, the low S-SI correlation group ranked as significantly more applicable
to themselves items describing them as confused, having feelings of not contributing
enough to life, not being responsible for their own decisions, needing to know how
they impress others, not facing things, shrinking from crises and difficulties, not
respecting themselves, making strong demands on themselves, having a feeling of
hopelessness, being self-centered, having their hardest battles with themselves, being afraid of
disagreements, being indecisive, inhibited, and having to rationalize and excuse their
behavior.

Turning to the correlations between the self concept as a psychologist and the
ideal self as a psychologist, it is found that the clinicians having the highest P-PI
correlations (best integrated in vocational terms) differed from the group having the
lowest P-PI correlations (most poorly integrated in vocational terms) on a total of
30 items for the self-as-psychologist (P) sort. The high group ranked as more
applicable to themselves items describing them as having a warm relationship with


⁴In cases in which two or more subjects had identical correlations at the lower margin of the upper
group or the upper margin of the lower group, the final choice of subjects to be included in the terminal
distributions was made by random number choice.

⁵Italicized items are those which are found to differentiate the upper and lower S-SI groups only,
i.e., on which no differences were found in contrasting the best and most poorly integrated clinicians
in vocational terms.
TABLE 1. LIST OF Q-SORT ITEMS. IN PARENTHESES FOLLOWING EACH ITEM ARE GIVEN LETTERS S, SI,
P AND PI INDICATING SIGNIFICANT DISCRIMINATION BETWEEN THE UPPER AND LOWER CORRELATION
GROUPS ON THE S, SI, P AND PI SORTS RESPECTIVELY. WHERE NO LETTER IS GIVEN, THE ITEM DID
NOT DISCRIMINATE SIGNIFICANTLY ON ANY OF THE SORTS.








1. I feel uncomfortable while talking with someone.
2. I put on a false front.
3. I am a competitive person.
4. I make strong demands on myself. (S)
5. I often kick myself for the things I do.
6. I often feel humiliated.
7. I am much like the opposite sex.
8. I have a warm emotional relationship with others. (S, P)
9. I am an aloof, reserved person.
10. I am responsible for my troubles. (S)
11. I am a responsible person. (PI)
12. I have a feeling of hopelessness. (S)
13. I live largely by other people’s values and standards.
14. I can accept most social values and standards.
15. I have few values and standards of my own.
16. It’s difficult to control my aggression.
17. Self control is no problem to me. (S, P)
18. I am often down in the dumps.
19. I am really self-centered. (S)
20. I usually like people. (S, P)
21. I express my emotions freely. (S)
22. Usually in a mob of people I feel a little bit alone.
23. I want to give up trying to cope with the world. (P, PI)
24. I can live comfortably with the people around me.
25. My hardest battles are with myself. (S)
26. I tend to be on my guard with people who are somewhat more friendly than I expected.
27. I am optimistic.
28. I am just sort of stubborn.
29. I am critical of people.
30. I usually feel driven.
31. I am liked by most people who know me.
32. I have an underlying feeling that I’m not contributing enough to life. (S, P)
33. I feel helpless.
34. I can usually make up my mind and stick to it. (S)
35. My decisions are not my own. (S, P)
36. I often feel guilty.
37. I am a hostile person.
38. I am contented.
39. I am disorganized.
40. I feel apathetic.
41. I am poised. (S, P)
42. I just have to drive myself to get things done.
43. I often feel resentful.
44. I am impulsive.
45. It’s important for me to know how I seem to others. (S, P)
46. I don’t trust my emotions. (P)
47. It’s pretty tough to be me.
48. I am a rational person. (S)
49. I have a feeling I’m just not facing things. (S, P)
50. I am tolerant. (S, P)
51. I try not to think about my problems. (SI, P, PI)
52. I have an attractive personality. (P)
53. I am shy.
54. I need somebody to push me through on things.
55. I feel inferior. (P)
56. I am no one. Nothing really seems to be me.
57. I am afraid of what other people think of me. (P)
58. I am ambitious.
59. I despise myself.
60. I have initiative. (S, SI, P, PI)
61. I shrink from facing a crisis or difficulty. (S, P)
62. I just don’t respect myself. (S, P)
63. I am a dominant person. (S)
64. I take a positive attitude toward myself. (S, P)
65. I am assertive. (S)
66. I am afraid of full-fledged disagreement with a person. (S)
67. I can’t seem to make up my mind one way or another. (S)
68. I am confused. (S, P)
69. I am satisfied with myself. (S, P)
70. I am a failure.
71. I am likable.
72. My personality is attractive to the opposite sex.
73. I have a horror of failing in anything I want to accomplish. (SI, P, PI)
74. I am relaxed and nothing really bothers me. (P)
75. I am a hard worker. (S, P)
76. I feel emotionally mature. (S)
77. I am afraid of sex.
78. I am naturally nervous.
79. I really am disturbed.
80. All you have to do is just insist with me and I give in.
81. I feel insecure within myself. (P)
82. I have to protect myself with excuses, with rationalizing. (S)
83. I am a submissive person.
84. I am intelligent. (S, P)
85. I feel superior.
86. I feel hopeless.
87. I am self reliant.
88. I often feel aggressive.
89. I am inhibited. (S)
90. I am different from others. (P)
91. I am unreliable.
92. I understand myself. (P, PI)
93. I am a good mixer. (S, P)
94. I feel adequate. (S, P, PI)
95. I am worthless.
96. I dislike my own sexuality.
97. I am not accomplishing.
98. I doubt my sexual powers.
99. I am sexually attractive. (PI)
100. I have a hard time controlling my sexual desires.
others, having self control, initiative, liking people, being poised, tolerant, satisfied
with themselves, hard workers, intelligent, good mixers, taking a positive attitude
toward themselves, feeling adequate, relaxed and unharassed,⁶ having an attractive
personality, understanding themselves, and being different from others. The low P-PI
correlation group, on the other hand, ranked as more true of themselves items
describing them as having a feeling of not contributing enough to life, not facing
things, not making their own decisions, needing to know how they seem to others,
being confused, shrinking from crises and difficulties, not respecting themselves,
having a horror of failing in goals set for themselves, wanting to give up trying to cope
with the world, feeling inferior, insecure, not trusting their own emotions, trying not to
think about their own problems, and being afraid of others’ opinions about them.

With respect to the ideal self sorts (SI and PI), differences in mean rank of 1.0 
or more between the upper and lower correlation groups were found for only 11 
items, three on the SI sort and eight on the PI sort. These items can be identified 
from Table 1. 


DISCUSSION


It is noteworthy that those items for which differences were found between the 
terminal distributions of both the S-SI and P-PI correlation groups retain the 
identical direction of difference in each case. No statistical analyses of the items 
differentiating the best and least well adjusted clinicians were made; it is fairly clear, 
however, that these groups are differentiated mainly in terms of traits descriptive of 
their overall personal adequacy and adjustment, and of the quality of their inter- 
personal relationships. 

Of special interest are the items differentiating the upper and lower P-PI 
groups. In vocational terms, the best adjusted clinicians feel more strongly that 
they understand themselves and feel relaxed and unharassed in their work. In 
contrast, the least well adjusted clinicians describe themselves more strongly voca- 
tionally as feeling inferior and insecure. The crucial difference between the two 
groups as clinicians, it seems, lies in their degree of self acceptance and understand- 
ing, traits which would appear to be of prime importance for satisfactory adjust- 
ment as clinical psychologists. 

The self ideal sorts (SI and PI) did not yield striking differences between groups. 
This finding (as well as other aspects of the present study to be reported in sub- 
sequent papers) points to the fact that differences in integration were not based on 
qualities of the ideal self concept, but rather on the manner in which the self is per- 
ceived. The ideals of both the best and most poorly adjusted clinicians are virtually 
similar; the striking differences lie in their evaluations of themselves. 

The data presented here indicate clearly that the Q-sort items were psycholog- 
ically valid in meaning for the psychologists tested. The correlation between the 
self and ideal self concept in both general and vocational terms appears to have con- 
siderable validity internally in describing degree of integration, and should be tested 
against suitable external criteria. 


SUMMARY 


This study was designed to identify those traits (from among 100 Q-sort items) 
which most clearly distinguish between groups of well and poorly integrated clinical 
psychologists, in both general and vocational terms. The groups were selected from 
among 36 clinicians at various levels of training and experience on the basis of the 
degree of correlation between the self and ideal self concepts. The results indicate: 


1. The best and most poorly integrated clinicians are differentiated mainly
in terms of traits descriptive of overall personal adequacy and adjustment; as
clinicians, i.e., in a vocational sense, the significant differences in the two groups lie in
their degree of self-acceptance and understanding.

2. The ideal self concepts of both the best and most poorly integrated clinicians
are virtually similar; the crucial differences lie in their evaluations of themselves.

3. The correlation between the self and ideal self concepts in general as well as
in vocational terms appears to have considerable internal validity as a measure of
degree of integration, and should be tested against external criteria.


⁶The italicized items are those which were found to differentiate only the upper and lower P-PI
groups, i.e., items on which differences were not found in contrasting the best and most poorly
integrated psychologists in general terms.
AN ANALYSIS OF SOME CLINICAL JUDGMENTS ON MALE BASIC
AIRMEN WHO FAILED THE GROUP PSYCHOLOGICAL TESTS*


MILTON B. JENSEN AND JOHN SCHMID


Veterans Administration Hospital, Salisbury, N.C.; Personnel and Training Research Center,
Lackland Air Force Base


PROBLEM AND METHOD 


In this study an attempt has been made to analyze the factors involved in
psychological evaluation in a clinical setting in the Air Force. The subjects evaluated
were 660 male basic airmen who failed the group psychological tests (AC-1B
& AFQT), thereby becoming eligible for discharge for failure to meet minimum mental
standards for retention. Clinical evaluations supplied the basis for recommendation,
either for retention in or discharge from the Air Force, a recommendation
invariably followed during the period from July through October, 1952, with which
the study is concerned.

Data for this period were chosen because (a) by July, 1952 the clinical examiners
had received several months training and experience under the direct supervision
of the senior writer, and (b) the July-October period provided populations
adequate for statistical treatment: 100 recommended for discharge and 560
recommended for retention. Complete data for a longer period of examination are
given in two multigraphed publications by the senior writer from Lackland Air
Force Base, San Antonio, Texas.


*This study was completed while the senior writer was on active duty with the Air Force at Lack- 
land Air Force Base. The opinions and conclusions contained in this paper are those of the writers. 
They are not to be construed as necessarily reflecting the views or the endorsement of the Air Force. 

The authors gratefully acknowledge the assistance given by the following airmen who aided
materially in completion of this study: S/Sgt. James E. Henry, A/1C Gerald R. Zigler and A/2C
Lewis P. Lipsitt, who served as examiners, and S/Sgt. Robert Wheeler who performed the factor
analyses. Drs. John M. Leiman and Michael A. Zaccaria kindly gave constructive criticisms and
suggestions.
Failure on the AC-1B (Airman Classification Battery) for this period was Index
(Stanine) 3 or lower on all categories with the exception of the Services Index (No.
7). For the AFQT (Air Force Qualification Test) it was a score below the 10th percentile.
Failure on the AC-1B followed by failure on the AFQT led to individual (clinical)
psychological evaluation.

The examiners through whom the basic data were secured were airmen selected 
and trained by the senior writer. Though five examiners were employed with the 
subjects of this study most examinations were completed by two of them. Academic 
backgrounds of the examiners varied from high school graduation to near-completion 
of doctoral training in clinical psychology. The senior writer evaluated the assembled 
data on each subject, usually in conference with the two or more examiners who had 
participated in the examination. An over-all evaluation in terms of fitness for Air 
Force duty was made. In doubtful cases or where data were conflicting, the writer
examined the subjects further, interviewing and/or testing as necessary. Following
Air Force directive, subjects whose value or lack of value to the Air Force appeared
doubtful were retained in service. Of the 560 retained about three per cent (3%)
were so considered. Each examiner concerned with a subject independently
rated that subject for retention or discharge, or as of questionable suitability. Different
examiners were responsible for different phases of the examination in such a
manner that each subject was judged by at least two examiners. Observations of
behavior during testing and interviewing were recorded by the examiners. 


VALIDITY AND RELIABILITY OF EVALUATIONS 


Some evidence of the validity of these evaluations is afforded by the small percentage
of men who failed to complete basic training. Only about four percent of the men 
retained on the basis of these recommendations failed to complete the basic training 
program required for all airmen. This is particularly significant in view of the pre- 
vailing conception that men who fail the group psychological examinations are unfit 
for military duty. Obviously we do not know how well those discharged on the basis 
of our recommendations would have done in basic training had they been retained 
in service. 

There is evidence that our over-all evaluations of these 660 subjects are relative- 
ly stable (reliable). Six months after completion of the evaluations, two of the 
examiners independently re-evaluated the subjects from the case records alone and 
without knowledge of the previous (initial) evaluations. The correlation (phi) of 
their re-ratings as to retain or discharge is .76. Their re-ratings correlated with the 
initial evaluations .72 and .77. Since the evaluations originally were arrived at in 
conference of always three or more judges we estimate reliability of evaluation as 
represented by a coefficient of about .91. 
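
The text does not say how the estimate of about .91 was reached. One conventional route from a single-judge agreement figure to the reliability of a pooled judgment is the Spearman-Brown formula, and the sketch below applies it under that assumption, treating the three re-rating correlations quoted above as single-judge values and the conference as three parallel judges. The function name is illustrative only.

```python
def spearman_brown(r_single: float, k: float) -> float:
    """Estimated reliability of a pooled judgment from k parallel judges,
    given the correlation r_single between any two single judges."""
    return k * r_single / (1.0 + (k - 1.0) * r_single)


# Re-rating correlations quoted in the text; k = 3 is an assumption
# based on "conference of always three or more judges".
for r in (0.72, 0.76, 0.77):
    pooled = spearman_brown(r, 3)
    print(f"single-judge r = {r:.2f} -> pooled reliability = {pooled:.3f}")
```

Under these assumptions the pooled values fall between roughly .89 and .91, which is in line with the coefficient reported; with more than three judges the estimate would be somewhat higher.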

We do not have adequate or even supportable measures of reliability for the 
tests and ratings comprising the basic data for this study. Reliability coefficients 
reported by test authors are not applicable to our sample because it represents a 
group of men who are massed in the lower ranges of academic proficiency as well as 
in many aspects of intellectuality. We were unable in this situation to work sufficient- 
ly independently to secure the measures of consistency desired. Despite these limit- 
ations we feel that our data have meaning and throw considerable light on the mental 
processes of the senior writer and his assistants in arriving at a decision as to whether 
male basic airmen who fail the Air Force group psychological tests should be retained 
in service under conditions existing during the period under investigation. We think 
other clinicians would operate similarly, even under considerably different conditions. 

During this particular period military manpower requirements were relatively 
constant. Possibly, as we followed the success in basic training of those recommended 
for retention we became less prone to recommend discharge. At the writing of this 
paper (February 1954) personnel needs of the Air Force are not as acute as in 1952. 
Undoubtedly we would now recommend discharge more freely. However, we doubt
that our 1954 psychological processes of evaluation differ radically from those of
1952, except as, we hope, we now are more skillful.


REGRESSION ANALYSIS


Seventeen variables are considered in this study. Their intercorrelations (Table 
2) are examined by means of multiple correlation and factor analysis techniques. 
The variables are listed in Table 1, together with some evidence of the narrow ranges 
of the data which, obviously, influence our results. 

Hereafter, “recommendation” (variable 17) is called the criterion. The validity of
these recommendations is given only superficial treatment in this study. A long-term,
controlled investigation of the degree to which these evaluations predict success
and failure in the Air Force is impossible under existing conditions. We are here
concerned with some of the mental processes by which these clinical judgments are
made and, to some extent, with how the writers and their associates view those processes.


TABLE 1. VARIABLES USED IN ANALYSIS OF CLINICAL EVALUATIONS OF 660 MALE BASIC AIRMEN WHO
FAILED THE GROUP PSYCHOLOGICAL TESTS (AC-1B AND AFQT).*

                                                 % of       %          %
Tabulation and Test Variables                    Total   Retained  Discharged

Race (plus correlation favors Caucasian)
  Caucasian                                       45        80         20
  Negro                                           55        92          8

Community of origin
  City                                            43        85         15
  Rural and small town                            57        85         15

                                                 Mean     S. D.      Range

Ammons Full Range Picture Vocabulary Age         11.1      1.8
Porteus IQ
Porteus Q-Score                                                      0-196
Moore Eye-Hand Coordination (Speed in Seconds)   4                   76-186
Jastak-Bijou Wide Range Achievement
  Reading Grade                                                      1.0-12.3
Jastak-Bijou Wide Range Achievement
  Arithmetic Grade                               5.0                 1.0-9.0

Ratings by Examiners (Based on 130-item structured case history, other interview data, behavior
in examination and “S” psychiatric profile with descriptive categories and ratings by the senior
writer. Profiling follows the procedures of AF Regulation 160-13A, where “1” indicates no defect,
“2” mild defect, “3” moderate defect and “4” severe or totally disqualifying defect. Per cents
below are for the total population of 660.)

*The variables listed are henceforth referred to in this paper by the numbers given here.


TABLE 2. INTERCORRELATIONS AND REGRESSION STATISTICS¹

35 -08 -03
60 14 08
64 11
69 31
61 70
07
21
43

¹Decimal points omitted
²Beta weights
³Per cent contribution to variance of criterion, variable 17.

Multiple R² = .74
Multiple R = .86


Our first major step in examination of these evaluations was by intercorrelation 
techniques. The intercorrelations for the 17 variables are given in Table 2, together 
with beta weights and the per cent of contribution of each of the first 16 variables 
(1, 2, 3, ... 16) to the total variance of the criterion (17). We recognize the limitation 
of this method with these data. We sought more appropriate methods; however, it 
was felt that intercorrelation techniques would furnish additional insight into the 
nature of these airmen and our methods for evaluating them. 
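
The paper does not state how the per cent contribution of each variable was computed. A common convention, assumed here, is to take the product of each beta weight and that variable's correlation with the criterion; these products sum to the squared multiple correlation. The sketch below illustrates the identity on a hypothetical two-predictor case (the correlations are invented for illustration, not taken from Table 2) and then checks the relation between the two multiple-correlation figures that Table 2 does report.

```python
import math


def contributions(betas, validities):
    """Return the products beta_i * r_iy, their sum (R squared), and R."""
    shares = [b * r for b, r in zip(betas, validities)]
    r_squared = sum(shares)
    return shares, r_squared, math.sqrt(r_squared)


# Hypothetical correlations: predictors 1 and 2 with each other (r12)
# and with the criterion (r1y, r2y). Not the paper's data.
r12, r1y, r2y = 0.5, 0.6, 0.4

# Standard two-predictor beta weights.
beta1 = (r1y - r2y * r12) / (1 - r12 ** 2)
beta2 = (r2y - r1y * r12) / (1 - r12 ** 2)

shares, r_squared, multiple_r = contributions([beta1, beta2], [r1y, r2y])
for name, share in zip(("variable 1", "variable 2"), shares):
    print(f"{name}: {100 * share:.1f} per cent of criterion variance")
print(f"R squared = {r_squared:.3f}, R = {multiple_r:.3f}")

# Figures reported in Table 2: R squared of .74 implies R of about .86.
print(f"sqrt(.74) = {math.sqrt(0.74):.2f}")
```

Because the products can be negative when a beta weight and its validity differ in sign, a per cent contribution of this kind is a bookkeeping device rather than a unique apportionment of variance, which is consistent with the caution the writers express about the method.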


INTERPRETATIONS 


Examination of Table 2 reveals, among other things, that 74 per cent of the
variability of the criterion may be attributed to variation in the other 16 variables.
In addition, when all the variables are considered together, variables 13, 14, 15, and
16 contributed most of the variability of the criterion. However, the high intercorrelations
of all the variables do not preclude some other set of four being equally
efficacious in this respect. We consider R of .86 as high as is reasonable considering
the nature of the data, particularly as regards reliability (or unreliability). However, 
these observations must be interpreted with extreme caution since all of the sub- 
jective ratings (variables 9-17) were made with knowledge of the more objective 
data (variables 1-8). We do not know what these ratings would have been had the 
other data not been available. It will be noted that all the ratings, including the 
criterion, are highly intercorrelated. Also, that most of them are significantly cor- 
related with the variables generally considered objective. Our problem is further 
complicated by the failure of many of our subjects to reveal their actual abilities
readily on objective tests, even under clinical conditions. We are reasonably
certain that not all the data with which we concerned ourselves in making these
evaluations are useful or necessary. Certain of them may even detract from soundness
of judgment by confusing the irrelevant with the relevant. In this military
situation it was impossible to have each subject, or even many subjects, examined by
an experienced clinician. Had this been possible, examinations would have been abbreviated
for many subjects, carried only as far as the clinician considered essential to valid
evaluation. Even so, we might have contradictions of data similar to those reported
by Carp.

The three examiners independently judged the 16 variables as to the relative 
importance of each variable in terms of its concomitant independent variation with 
the criterion. The examiners agreed to the extent of rank difference correlations .27, 
.39, and .42. Although these correlations approach significance, the low order of 
agreement cannot reasonably be credited to ineptness. As psychological evaluators 
in this situation the examiners were carefully trained and supervised. Their diligence 
and devotion to their work left nothing to be desired. They are of outstanding mental 
ability. One examiner received an AM degree in clinical psychology prior to enlist- 
ment and is a member of the American Psychological Association. There is no evi- 
dence to show that he did better than did the two examiners lacking such training. 
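
As an illustration of the rank-difference (Spearman) correlation used to compare the examiners' orderings, the following sketch applies the classical formula rho = 1 - 6 * sum(d^2) / (n(n^2 - 1)). It assumes untied ranks, and the two rankings of the 16 variables are hypothetical; the examiners' actual rankings are not given in the paper.

```python
def rank_difference_rho(ranks_a, ranks_b):
    """Spearman rank-difference correlation for two untied rankings."""
    n = len(ranks_a)
    if n != len(ranks_b) or n < 2:
        raise ValueError("rankings must have equal length of at least 2")
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1.0 - 6.0 * sum_d_squared / (n * (n * n - 1))


# Hypothetical importance rankings of variables 1-16 by two examiners
# (1 = judged most important). Invented for illustration only.
examiner_x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16]
examiner_y = [4, 1, 9, 2, 12, 3, 16, 5, 6, 14, 7, 8, 15, 10, 11, 13]

rho = rank_difference_rho(examiner_x, examiner_y)
print(f"rank-difference correlation = {rho:.2f}")
```

With only 16 items ranked, coefficients of the size reported (.27 to .42) carry wide sampling error, which is one reason agreement of this order is hard to distinguish from chance.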

From the data above it appears obvious that the writer and his associates knew 
very little about the mental processes they employed in arriving at clinical judgments 
in the situation described. One wonders whether other clinicians are more astute in 
this respect. The findings of Carp lend some support to a conclusion that they are
not very different in this respect. She found high inter-rater reliability of judgment,
or measure, of “constriction” in children, yet very little agreement among various
measures of “constriction.” Unfortunately she used an ill-defined term. It (constriction)
may not be a specific characteristic of personality, or it may be specific
to certain situations (tests, examiners, etc.) and not to others. We do not know how
well these same clinicians and/or tests would agree on an over-all or more definitive
evaluation of the same children. In any event, we may not summarily dismiss the
judgments of clinicians as invalid simply because they do not know how they function.
This probably is true of most, if not all, technicians to a degree not commonly
suspected. One wonders whether surgeons, for example, do in the operating room
exactly as they think they do or as they teach should be done.


FACTOR ANALYSIS AND INTERPRETATIONS*


It has long been a contention of the senior writer that clinicians, and particularly 
psychologists, use mental processes much less complicated than they suppose or 
than is generally claimed. Our regression analysis indicated that this is the case with 
us, or that few characteristics were susceptible of rating in this situation. Despite 
the use of 16 variables in making a judgment to retain, or to discharge a man as unfit 
for military duty, it seems that only four factors underlie the judgments. The writers
had thought that perhaps only one characteristic was being measured and were 
surprised to find that the data suggested four factors. Since factor analysis is a 
mathematical process which may be used to examine this hypothesis, by factoring 
the correlation matrix of the rating variables (9-17) shown in Table 3, it is possible 
to determine if four factors reasonably account for the nine judgments. 

Accordingly four factors were removed from the intercorrelation matrix and the 
residual correlations were examined. These residual correlations appear in Table 3. 
If four factors are sufficient to account for the common factor variability of the 


*This analysis is only of the subjective ratings (variables 9-17, Table 3). Factor analyses of the
entire data resulted in no definitive or clean-cut factor structure. When we considered all the variables
in the factor analysis, we found nothing promising or even suggestive of what could be done. The
residual correlations remaining after removal of factors other than those removed in the rating data
suggested no solution. Ordinarily when this happens in a study, little more can be said than that new reference
variables should replace some of those used in the study. Apparently, the various objective tests used
in this study have no clean-cut factor structure with these kinds of airmen.

TABLE 3. INTERCORRELATIONS OF RATINGS ON VARIABLES 9-17 (UPPER HALF OF TABLE), AND
RESIDUAL CORRELATIONS (LOWER HALF OF TABLE).








Variable 9 10 11 12 13 14 15 16 

9 47 66 68 60 55 

10 36 46 46 47 48 

11 5 é 41 43 41 39 

12 ¢ 30 46 41 44 

66 58 68 

005 73 64 

-012 009 65 

011 -013 -010 039 

-067 051 022 068 007 











Residual r: Mean: -.0003 S. D.: .029 


nine variables, then the residuals should be near zero. An examination of Table 3 
shows that this is the case. Provided that the variance contributed by these four 
factors to the total variance of these nine judgments is sufficiently high, that is ap- 
proaches the reliability of the tests, then we are in a position to assert that our 
hypothesis of only four characteristics actually being rated is sustained. The per
cent contribution of the four factors by variable, in the column headed “Communality,”
is shown in the factor analysis findings, Table 4. Since we do not have reliability
ratings for each of the nine variables, we cannot judge whether these communalities
(percentages) do approach the reliability. However, as indicated previously,
the rate-rerate correlation of the criterion is of the order of .72 - .77, and
since it may be expected that this will be higher than for any of the other ratings,
we have reason to believe there are no small specific factors being measured. The
possible exceptions might be for variables 10, 11, and 12. However, the writers believe
that, even for these variables, the specificity is negligible. Consequently, we
conclude that only four factors are present in these nine ratings.
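
The residual check described above can be sketched as follows. For uncorrelated factors, the correlation reproduced between two variables is the sum of the products of their loadings, and the residual is the observed correlation minus that sum. The loadings and observed correlations below are hypothetical (four variables, two factors) and stand in for the nine ratings and the factors actually extracted; only the procedure is illustrated.

```python
import statistics


def reproduced(loadings, i, j):
    """Correlation between variables i and j implied by orthogonal factors."""
    return sum(a * b for a, b in zip(loadings[i], loadings[j]))


def residual_correlations(observed, loadings):
    """Observed minus reproduced correlation for every pair i < j."""
    n = len(observed)
    return [
        observed[i][j] - reproduced(loadings, i, j)
        for i in range(n)
        for j in range(i + 1, n)
    ]


# Hypothetical loadings: one row per variable, one column per factor.
loadings = [
    [0.7, 0.3],
    [0.6, 0.4],
    [0.7, -0.2],
    [0.5, 0.0],
]

# Hypothetical observed intercorrelations for the same four variables.
observed = [
    [1.00, 0.55, 0.42, 0.36],
    [0.55, 1.00, 0.35, 0.29],
    [0.42, 0.35, 1.00, 0.34],
    [0.36, 0.29, 0.34, 1.00],
]

residuals = residual_correlations(observed, loadings)
print("mean residual:", round(statistics.mean(residuals), 4))
print("S.D. of residuals:", round(statistics.pstdev(residuals), 4))
```

A mean residual near zero with a small standard deviation, as reported beneath Table 3 (mean -.0003, S. D. .029), is the pattern expected when the extracted factors account for the common variance.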


TABLE 4. BI-FACTOR SOLUTION

Group Factors (I, II, III, IV)        Communality (h²)


In order to gain some insight as to how these nine variables cluster together into
the four groups, the factor analysis is put in a form shown in Table 4. The four 
factors measured are indicated under the columns headed ‘‘Group Factors.” Several 
variables are considered as indicating a factor when their factor loadings (the numeri- 
cal values found opposite each variable for each factor) are non-zero. We consider 
values of .20 and above as being non-zero. Hypothesizing the nature of the factors 
is done by making an “educated guess” of what is common to the variables having 
non-zero loadings on each factor. Proceeding in this manner, we believe that the four 
factors are best characterized as follows: 


Factor I—Acceptance and enjoyment of people by the non-delinquent. We are 
convinced that in evaluating these airmen we clustered the non-delinquent who is 
socially empathic, rejecting those with histories of delinquency, even though their 
feelings for people were hygienic. 


Factor II—Largely a measure of intolerance for the conflict between aversion 
for traditional schooling and pressures to continue and work at that schooling. As a
group the individuals of our sample were inept academically but not necessarily
because of lack of ability. The negative loading for the criterion poses a problem
which should be explored. There are several hypotheses. We are dealing with a
group of low academic achievement. Perhaps those of this group who rebelled
against academic and family pressures were considered better risks in terms of
emotional adjustment and as better motivated for the Air Force. Personality development
may be retarded by passive-aggressive reactions to pressures. We believe
that we were influenced favorably by independence of thought and action of
these individuals. They were averse to schooling. They resisted learning. They
rebelled against the family and social pressures for continuation of schooling. As we
look back on our work with them, they never believed that they were stupid. They
did not seek escape through delinquent behavior. They quit school for work or the
Air Force at the first propitious (not always the first possible) moment. The
senior writer described some of this, very tentatively, in a previous publication.


Factor III—Emotional adjustment and observed behavior in the examination. 


Factor IV—Motivation for Air Force duty and Air Force adjustment. Factors III and IV, it seems
to us, are so largely self-explanatory as not to merit extensive discussion in this paper.

The general factor appearing in Table 4 is a special construct for this solution 
which enables us to say that the group factors are uncorrelated. The general factor 
may be considered as a construct which underlies the correlations and is common to 
all variables. This may be a rating factor common to all variables. 
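
The reading rules described for Table 4 can be sketched in a few lines. A loading is treated as non-zero when it reaches .20; the sketch applies that cut-off to the absolute value so that a negative loading (such as that of the criterion on Factor II) also counts, which is an assumption about how the rule was applied. Communality is taken as the sum of squared loadings on the general and group factors, which holds for uncorrelated factors. All loadings shown are hypothetical, since the body of Table 4 is not reproduced here.

```python
SALIENT = 0.20  # cut-off for a "non-zero" loading, as stated in the text


def salient_variables(loadings, factor):
    """Variables whose loading on `factor` is at least SALIENT in size."""
    return [
        variable
        for variable, row in loadings.items()
        if abs(row.get(factor, 0.0)) >= SALIENT
    ]


def communality(row):
    """Sum of squared loadings (general plus group factors)."""
    return sum(value ** 2 for value in row.values())


# Hypothetical loadings for three of the nine ratings; "g" is the
# general factor, "I" to "IV" the group factors.
loadings = {
    13: {"g": 0.70, "I": 0.30},
    14: {"g": 0.75, "II": 0.25},
    17: {"g": 0.80, "II": -0.22, "IV": 0.35},
}

print("Factor II is marked by variables:", salient_variables(loadings, "II"))
for variable, row in loadings.items():
    print(f"variable {variable}: h2 = {communality(row):.3f}")
```

Naming a factor then amounts to asking what the variables returned for it have in common, which is the "educated guess" the writers describe.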


TABLE 5. PERCENTAGE CONTRIBUTION OF FACTORS TO TOTAL VARIANCE OF
THE NINE RATINGS

I   Common-Factor Variance                          62.5
      g                                    50.0
      I                                     4.3
      II                                    2.6
      III                                   2.7
      IV                                    2.9

II  Unreliability and Specific Factor Variance      37.4








The percentage contribution of the factors to the total variance of the nine
ratings is shown in Table 5. It is seen that 62.5 per cent of the total variance is accounted
for by common factors, whereas its complement, 37.4 per cent, may be attributed
to unreliability and small or insignificant and specific factors. The common
factor variance for each of the variables shows that fifty per cent of the total variance
of the ratings is measured by the general factor.
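
The arithmetic behind Table 5 can be checked directly. With standardized variables, each contributes one unit of variance, so a factor's share of total variance is the sum of its squared loadings divided by the number of variables. The helper below states that relation (an assumption about how the table was built, since the paper does not give the computation), and the figures from Table 5 are then summed.

```python
def percent_of_total_variance(factor_loadings, n_variables):
    """Share of total standardized variance carried by one factor."""
    return 100.0 * sum(value ** 2 for value in factor_loadings) / n_variables


# Percentages as printed in Table 5.
table_5 = {"g": 50.0, "I": 4.3, "II": 2.6, "III": 2.7, "IV": 2.9}

common = sum(table_5.values())
remainder = 100.0 - common
print(f"common-factor variance: {common:.1f} per cent")
print(f"unreliability and specific variance: {remainder:.1f} per cent")

# Illustration of the helper with two hypothetical loadings of .50.
print(percent_of_total_variance([0.5, 0.5], 2))
```

The component percentages sum to the 62.5 reported; the complement computed this way is 37.5 rather than the printed 37.4, a difference of one unit in the last place that is presumably due to rounding. A general factor carrying 50 per cent of total variance corresponds to an average squared loading of .50, that is, loadings averaging roughly .7.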

What is of interest in this factor analysis is that the criterion or final recommendation
for retention or discharge has loadings on Factors II and IV, and possibly Factor
III. An examination of Factor II shows that the criterion is negatively related
to what is commonly thought of as normal school and family adjustment. An examination
of Factor IV shows that the criterion is positively related to motivation
for Air Force duty and to present Air Force adjustment. An examination of Factor
III suggests that the criterion is also positively related to the emotional adjustment
and behavior shown in the examination.


CONCLUSIONS 


These findings suggest that perhaps more reliable and valid ratings might be 
developed through careful consideration of these factors for such individuals and a 
normal expectancy is that the more validly and reliably these factors are measured, 
the sounder will be the recommendations for retention or discharge. Whether Factors 
II and IV or the suggested Factor III, such as are found in our sample, adequately 
cover the personalities of individuals is a question. Perhaps the recommendations 
should involve not only Factor I, but also other dimensions of personality which 
may have relevance for military performance, such as perseverance, willingness to 
follow directions, etc. 

In making these interpretations much caution, of course, should be exercised. 
These factors are hypothetical constructs developed for the purpose of making some 
sense among consistencies and inconsistencies of the ratings. The factor analyses 
might be thrown into a form which would make some other rationale equally plaus- 
ible. The findings presented here may be only one of several such rationales. It 
should be noted that the analysis was performed by a fairly objective technique. 
The writers feel that any other research worker would have found this same solution. 


We are planning other studies in this area. We feel that the hypotheses deduced
from the factor analyses can be subjected to experimental examination.


INDIVIDUAL BIAS IN PSYCHOLOGICAL REPORTS
JAMES T. ROBINSON AND LOUIS D. COHEN
Duke University¹


PROBLEM 


There has been considerable study of the relationship between the examiner 
and subject during conventional test administration and of the influence of the ex- 
aminer upon the subject’s responses in a variety of testing situations @: 5 8. 1°, 11, 18, 14), 
Individual differences among examiners in processing data secured from the use of projective
techniques have also been studied, and reports are beginning to appear
about the influence of the personality of the examiner on his reports about subjects.
The present paper is a report of a study designed to examine the systematic
variation in the content of psychologists’ reports and their individual biases.


THE SUBJECTS 


The subjects were three graduate students in psychology who were serving their 
internship training year at Duke Hospital a few years ago. Each of the students 
rotated through three services at the hospital and thus had an equal opportunity to 
see patients on the in-patient, out-patient and psychosomatic services. The psychological
case reports of each student were collected for the year, and the last 30 prepared
by each of the 3 students were set aside for analysis. There is no reason to believe
that any systematic bias influenced the types of patients the students examined,
since assignments to the study of patients were on a rotational basis. Examination
of the diagnostic categories suggested that no noteworthy differences existed
among the groups of patients examined by the three students.


PROCEDURE 


The psychological reports on patients included one section in which the students 
described their interpretation of the patient’s problems, the patient’s methods of 
dealing with his problems, his particular assets and liabilities, and another section 
which summed up briefly in a statement of a general nature the description of the 
patient’s illness and his manner of dealing with it. These two sections of the reports 
were the subject matter for the present study. The psychological reports were gen- 
erally of moderate length, about a page and a half of single spaced typewritten copy. 

Since all the students were under the supervision of the same staff psychologist, 
the modifications in the psychological reports due to his influence in reviewing the 
case may be considered equal. Reports prepared in the latter part of the students’ 
training were selected in order to diminish this influence since relatively more inde- 
pendence of report was allowed internes in the latter stages of training. 

The method of analysis of the content of reports was first to identify the phrase
units. In this regard, we followed the suggestion of Dollard and Mowrer that the
most serviceable unit seems to be what the grammarians call the independent clause.
Each of the reports was subdivided into its component independent clauses on the
basis of a set of instructions that were carefully elaborated.

The independent clauses were classified according to the description proposed by
Aron for use with the TAT. Of the complete range of variables which Aron identifies,
dependence, independence, aggression, and abasement were used for the major
classification. The dependence variable was subdivided into the variables succorance
and deference-compliance. The variable independence was subdivided into
autonomy, seclusion, recognition and independent-mature. The variable aggression
was first subdivided into physical, passive and verbal but these were later recombined.
The variable abasement was further defined as intraggression.

In order to classify certain vague but general descriptions about dependence
and independence used in the psychological reports, two variables, dependence-general
and independence-general, were added to the Aron listing. Thus, a total of
twelve behavioral classifications were specified and carefully defined. An intensity
dimension was introduced to deal with the problem of emphasis, lack of emphasis
and de-emphasis, and applied to each phrase.

All results were converted into percentages in order to make possible comparisons
among records of unequal length. The number of phrases identified served as
the denominator for the four major variables. For the minor variables, the number
of phrases for the major variable served as the denominator.
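
The two denominators described above can be illustrated with a short sketch. The phrase counts are hypothetical (a report of 40 phrase units, 8 of them classified under dependence), and the function names are illustrative; only the choice of denominator follows the text.

```python
def major_percentage(major_count, total_phrases):
    """Per cent of all phrase units in a report given to one major variable."""
    return 100.0 * major_count / total_phrases


def minor_percentages(minor_counts):
    """Per cent of a major variable's phrases falling under each minor variable."""
    major_count = sum(minor_counts.values())
    return {name: 100.0 * n / major_count for name, n in minor_counts.items()}


# Hypothetical report: 40 phrase units in all, 8 scored as dependence.
total_phrases = 40
dependence = {"succorance": 4, "deference-compliance": 3, "dependence-general": 1}

print(major_percentage(sum(dependence.values()), total_phrases))
print(minor_percentages(dependence))
```

Because the minor percentages are taken within the major variable, they always sum to 100 for that variable regardless of how long the report is or how much of it was scored.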

The reliability of the method of assessing psychological reports was tested by 
two independent judges, each of whom analyzed the same 30 reports. Three practice 
sessions preceded the actual test of the method, which yielded 25 separate reliability 
coefficients ranging from .68 to 1.00, with a median reliability of .94 on the number of 
scored units, the classification of units, and on intensity level. 


RESULTS 
In each case, psychologist A was compared with psychologist B, psychologist A
with psychologist C, and psychologist B with psychologist C. As will be noted in
Table 1, in the comparisons made on the major variables dependence, independence,
aggression and abasement, Subject A differed reliably from B but not from C in
emphasizing aggression and failing to make statements about abasement. B and C
differed reliably from each other in that B emphasized dependence and used the
fewest statements about independence. B also made more comments about dependence
than did Subject A. In the comparisons among the minor variables, reliable
differences were found among the subjects in the extent to which they emphasized
one or another variable in their reports about patients.


TABLE 1. RELATIVE RANK AND RELIABILITY OF DIFFERENCES IN THE USE OF CLASSIFIED DESCRIPTIVE
PHRASES BY THREE PSYCHOLOGICAL INTERNES ON 30 SEPARATE REPORTS

Ratings                    Subjects          P

Major Variables
  Dependence
  Independence
  Aggression
  Abasement

Minor Variables
  (Dependence)
    Succorance
    Def. Comp.
    Dep. Gen.
  (Independence)
    Autonomy               1
    Seclusion
    Recognition
    Ind-General
                           3
                           2
    Ind-Mature             2
                           3

Note: Relative rank is indicated by the figures 1 = high, 2 = medium, 3 = low. Subjects
indicated by letters A, B, C.

*p < .05 > .01
**p < .01

With regard to intensity, only five of 36 comparisons yielded reliable differences 
and no consistent trend of differences among the subjects was noted. 


DISCUSSION 


Before considering the meaning of these results, some obvious limitations of the 
approach used in this study should be emphasized. It would have been better to use
the same patients for each of the subjects, but this was obviously not possible without
introducing other difficulties. An alternate method would be to use the test protocol
as the sole source of data, but this is not typical clinical practice and such a
procedure would limit or eliminate interaction data. Thus, no easy way to ensure
similar patients comes to mind.

The use of only four of the major scoring categories of Aron raises a question
about the types of error that might arise in the comparison of proportions. For
example, suppose the four variables were discussed in 20% of subject A's report and
in 80% of subject B's report. If subject A devoted 100% of the scored part of his
report to one variable, he would still use proportionately less of his total report
on it than subject B, who might devote 50% of the scored part of his report to one
variable. Yet in a comparison of proportions, A
would exceed B by 50%. For this reason, it would seem that the results from the
major variables may be considered only suggestive.
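
The example in the preceding paragraph can be worked through numerically. The share of the whole report given to a variable is the scored fraction of the report multiplied by the variable's fraction of the scored part; the figures are those of the authors' hypothetical case, and the function name is illustrative.

```python
def share_of_total_report(scored_fraction, fraction_of_scored_part):
    """Fraction of the entire report devoted to one variable."""
    return scored_fraction * fraction_of_scored_part


# Subject A: 20% of the report is scored, all of it on one variable.
a_total = share_of_total_report(0.20, 1.00)
# Subject B: 80% of the report is scored, half of it on one variable.
b_total = share_of_total_report(0.80, 0.50)

print(f"A: 100% of scored part, {100 * a_total:.0f}% of whole report")
print(f"B:  50% of scored part, {100 * b_total:.0f}% of whole report")
```

On the scored part alone A appears to stress the variable twice as heavily as B, while on the whole report B gives it twice the space that A does, which is the reversal the authors warn about.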

However, within the classification of each major variable, the percentages of
minor variables would not suffer from this difficulty since the denominator is the
number of entries for the major variable and is stable. In studying these minor variables,
we note in Table 1 reliable differences among the subjects in their emphasis
upon one or another variable. It would thus seem that, despite these cautions, reliable
differences do appear in the study of the minor variables and that the differences
found for the major variables are probably also reliable.

It is planned to use data derived from Rorschach and individual psychoanalytic 
sessions to study further the relationship between the personality of the subjects and 
their reports about their patients. This may be extremely helpful in understanding 
the nature of the systematic bias found in this study. 

These findings tend to re-emphasize the presence of individual bias in the
reports on the personality of patients, thus raising a serious question about the objectivity
of methods of evaluation and prediction if they must rely solely upon psychological
reports for their basis. This is, of course, an ancient problem with psychologists,
one that, as Boring has indicated, goes back to the earliest problems of
individual differences in reaction time in astronomical reporting.

This study has tended to make us much less sanguine about the nature of our 
reports, and has had a salutary effect upon the definition of the role of the supervisor 
in the training of students working with projective techniques. It becomes apparent, 
too, that it is necessary for the student to be aware of his systematic biases as they 
may influence both the interpretation of data and his reporting on patients. The 
supervisor has the responsibility for helping the student see to what extent he is 
selectively perceiving and selectively reporting aspects of his patient’s behavior. 


SUMMARY 
The present study attempted to investigate the presence of individual biases in
the reporting of the personality of patients, using three internes in psychology who
had, over a year’s period, studied 30 patients each in a rotating service. The records
were analyzed using a revision of the Aron method of TAT analysis. Differences
among the three psychologists’ reports were studied, and pronounced and reliable
differences among all three were found. It is concluded that systematic individual
biases exist in the reporting of patients’ personalities and may be related to the individual
personality of the reporter in a systematic fashion. It was recognized that
(1) these findings raise a serious question about the use of psychological reports for
the evaluation or prediction of behavior, and (2) considerable advantage may
accrue from studies such as this when related to the teaching of students and to
pointing out the role of individual biases in the reporting by students of the problems
of patients.


SOME FACTORS IN SELF-JUDGMENTS
EVELYN P. MASON 


Washington University School of Medicine 


INTRODUCTION 

Problems confronting the aged have been discussed since the time of our earliest
philosophers. Because of this concern many psychological studies have been reported
dealing with the decline in mental abilities; others have dealt with levels of
adjustment. In the latter area Kuhlen has noted that the findings vary from study
to study and appear to be controversial. On the one hand, the findings of Albrecht,
Granick and Kleemeier suggest that good adjustment in old age reflects life long
patterns of good adjustment. Other studies by Diethelm and Rockwell, Doll
and Fried report that in the aging process the individual becomes less able to cope
with his life situations. Studies supporting the latter point of view suggest that this
influence is related to the individual’s feeling of self-worth and, hence, to his self-concept.
A study by Tuckman and Lorge on the attitudes of the aged toward the
older worker supports this hypothesis. These authors found when they interviewed
institutionalized and non-institutionalized adults from 60 to 80 years of age that as
individuals become less able to function they subscribe more and more to the erroneous
beliefs and ideas about the older worker. A study of self-judgments, then,
should provide useful information concerning adjustment patterns in later maturity.


PROBLEM 


It was hypothesized that living conditions, economic status and age are factors 
significantly related to attitudes of self-worth. To evaluate this, self-judgments of 
three groups were compared. A group of 60 indigent, institutionalized subjects 
above the age of 55 was compared with a group of 30 subjects of middle class status 
above the age of 60, who were still able to maintain an independent existence. This 
latter group was drawn from individuals seen for routine checkup by a physician 
specializing in geriatric medicine. This comparison yielded information concerning 
the individual’s feeling of self-worth when age was held constant and economic status 
and living conditions varied. A third group of 30 young adults of low-economic 
status was studied as a control along the age dimension. This group was composed of 
parents of clinic patients. Their clinic status indicated limited financial background 
and allowed a comparison of this group with the aged institutionalized group. 

Self-judgments were measured by the Self-Concept Questionnaire and the
W. A. Y. technique. The Self-Concept Questionnaire consists of 26 positive or
negative statements about the self taken from Fiedler’s Q-sort statements and
from statements which three experienced psychologists considered appropriate for 
the measurement of attitudes of self-worth of an aged population. Adequate judg- 
mental validity and test-retest reliability were established for the positive and 
negative scoring of each statement. The W. A. Y. technique is a projective device 
which requires a subject to respond with three answers to the question, “Who are
you?” Although the technique is still in the exploratory phase, normative data are 
available. Each response to the technique was scored by assigning it to the categories 
of name, family, social, occupation, age, sex, group membership, unit, positive affect, 
negative affect and longevity. 


RESULTS 


Statistical comparisons were made of the three groups’ responses to the two test
measures. As responses to the 26 items of the Self-Concept Questionnaire were 
dichotomous, i.e., either positive or negative, a chi-square analysis was done. A 
similar procedure was followed with the categories utilized in the W. A. Y. responses 
since a category either was or was not used. Attention to Table 1 indicates that the 
three groups responded to ten of the 26 items of the Self-Concept Questionnaire in 
a significantly different fashion. These items were: 

5. I am able to do things as well as most other people.

6. I would usually rather be by myself than with other people.

12. I enjoy living now as much as I used to.

14. I do not feel I have enough energy to do the things I would like.

18. Nobody pays much attention to what I do or say.

19. My day is filled with useful activities.

20. I am unhappy much of the time.

21. I find I am the type of person who feels as close to my family now as I ever have.

23. I worry about physical pain and suffering.

24. I am hardly ever very excited or thrilled.


With items 5, 12, 14, 19, and 23, there was no significant difference in the re- 
ports of the two aged groups, but both groups reported significantly fewer positive 
answers than did the group of young adults. Therefore, the two aged groups feel 
they are unable to do things as well as most other people, they feel they do not enjoy 
living as much as they used to, they do not feel they have enough energy to do the 
things they would like to do, they do not feel their days are filled with useful activi- 
ties, and they are inclined to worry more than the group of young adults about 
physical pain and suffering. 
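
The chi-square comparison of a dichotomous item across the three groups can be sketched as follows. This is a minimal illustration and not the author's computation: the counts of positive and negative answers are hypothetical (only the group sizes of 60, 30, and 30 come from the study), and the function name `chi_square` is introduced here for the example.

```python
def chi_square(table):
    """Return the Pearson chi-square statistic and degrees of freedom
    for a table of observed counts (one row per group)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if answering were independent of group.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return stat, df


# HYPOTHETICAL counts for one item: [positive, negative] answers per group.
item_counts = [
    [20, 40],  # Group I, N = 60
    [15, 15],  # Group II, N = 30
    [25, 5],   # Group III, N = 30
]

stat, df = chi_square(item_counts)

# Tabled critical values of chi-square for 2 degrees of freedom.
CRITICAL_05_DF2 = 5.991
CRITICAL_01_DF2 = 9.210

print(f"chi-square = {stat:.2f}, df = {df}")
print("significant at .01:", stat > CRITICAL_01_DF2)
```

The same function applied to any two of the three rows gives the pairwise comparisons (one degree of freedom each) of the kind reported alongside the overall test.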

TABLE 1. CHI-SQUARE TEST OF SIGNIFICANCE OF DIFFERENCE BETWEEN NUMBER OF POSITIVE
RESPONSES TO ITEMS OF SELF-CONCEPT QUESTIONNAIRE MADE BY GROUP I, 60 AGED, LOW-ECONOMIC,
INSTITUTIONALIZED SUBJECTS; GROUP II, 30 AGED, HIGH-ECONOMIC, NON-INSTITUTIONALIZED
SUBJECTS; AND GROUP III, 30 YOUNG, LOW-ECONOMIC, NON-INSTITUTIONALIZED SUBJECTS.

Columns: Test Item Number; Group I (N=60); Group II (N=30); Group III (N=30);
Chi-square for I, II, III; Chi-square for I, II; Chi-square for I, III;
Chi-square for II, III.

[Table body not legible in the source. Surviving entries, row and column positions
uncertain: 22; 8.02**; .16; 12; 13.78**; 4.50**; 2.04; 3.98*; Total 21.24**.]

*Significant at .05 level of confidence.
**Significant at .01 level of confidence.


The aged institutionalized individuals differed from the young adults on all ten 
items, reporting significantly fewer positive responses. However, there was no sig- 
nificant difference in the number of positive responses made by the young adult and 
the aged independent groups to items 6, 18, 20, 21, and 24. These two groups, then, 
feel less that they would rather be by themselves, they feel less that no one pays 
much attention to what they do or say, they report fewer feelings of being unhappy, 
they feel closer to their families at the present time, and they are more frequently 
thrilled and excited over happenings than is the aged institutionalized group. 

Analysis of the responses from the projective device produced additional
information concerning the view of “self” of the three groups. Reference to Table 2
shows that of the 11 categories of the W. A. Y. technique five are differentially used 
by the three groups. 

Name is a category which occurred significantly more frequently in the responses of 
the aged institutionalized group than in the responses of the other two groups. Occu- 
pational interests, on the other hand, were more frequently seen in the responses of 
the aged independent and the young adult groups than in the responses of the aged 
institutionalized. Of particular significance was the evidence that no difference oc- 
curred in the frequency with which the two aged groups used the family category, 
while this category occurred significantly more frequently in the responses of the 
young adult group. Furthermore, no difference occurred in the two aged groups’
use of the categories of negative affect and longevity, and these categories occurred 
significantly more frequently in the responses of the two aged groups than in the 
responses of the young adult group. Positive affect, on the other hand, was displayed
more frequently in the responses of the aged independent group than in the other
two groups.

TABLE 2. CHI-SQUARE TEST OF SIGNIFICANCE OF DIFFERENCES BETWEEN CATEGORIES REPORTED ON
THE W. A. Y. TECHNIQUE BY GROUP I, 60 AGED, LOW-ECONOMIC, INSTITUTIONALIZED SUBJECTS;
GROUP II, 30 AGED, HIGH-ECONOMIC, INDEPENDENT SUBJECTS; AND GROUP III, 30 YOUNG,
LOW-ECONOMIC, NON-INSTITUTIONALIZED SUBJECTS.

Columns: Category; Group I (N=60); Group II (N=30); Group III (N=30);
Chi-square for I, II, III; Chi-square for I, II; Chi-square for I, III;
Chi-square for II, III.

[Table body only partly legible in the source. Surviving entries are listed by row in
the order in which they appear; column positions are uncertain.]

Name: 37; 13; 8.59*; 8.01**; 2.60
Family: 16; 16; .36**; 2.66; 16.36**
Social: not legible
Occupation: 5; 7; 2; 11.64**
Age, Sex, Group Membership, Unit, Positive Affect, Negative Affect: not legible
Longevity: 11; 10; 1

*Significant at the .05 level of confidence.
**Significant at the .01 level of confidence.

Inspection of the type of answers and the actual responses reported by the
three groups to the question “Who are you?” suggested further qualitative
differences which were somewhat obscured by the more objective analysis. Whereas the
young adult group typically reported activities reflecting their role in life such as,
“I am a mother,” “I am a husband, a father, a farmer, a miner,” the aged
independent group’s responses reflected a greater concern with the present emotional
tone of their life situation. Typical responses of this group were: “I like to do things.”
“I like to help people if I can help them.” “I am a man who has accomplished many
things in life.” “I am happy to be here.” “I am a no count loafer.” “I don’t add much to
the world and I don’t take much away.” “I am a middle aged man.” “I used to hunt
a good deal but I can’t anymore.” “I am a former water commissioner.” “I am someone
who hasn’t made the most of my opportunities.”

Even though the aged institutionalized group responded more frequently with
the categories of name and group membership (e.g., “I am a Christian.” “I am a
German.”), their responses showed a concern with emotionality similar to that
evidenced by the aged independent group. Examples of these were: “I am a happy
go lucky.” “Might say I had as good a home as I could expect, lucky to be inside.”
“I’m lucky to be alive.” “I’m 79 years old.” “I’ve seen better days.” “I’m nobody.”
“I had a good husband and a happy life.” “I’m a man with a misspent life.” “I made
[illegible] of my life.” “Wish it was all over.” “I’m just a dependent.” “I’m getting
old.”

DISCUSSION


The analysis of the data obtained from self-judgments on the Self-Concept 
Questionnaire shows that the view of self-worth of the three groups differs
significantly. The aged institutionalized group, in general, views self-worth more negatively
than the aged independent group, and this group in turn is more negative in its 
view than are the young adults. Nevertheless, the two aged groups, despite widely 
different living conditions and economic status, report similar negative views con- 
cerning present state of happiness and ability to contribute. This finding suggests 
that old age is a factor which is significantly related to some negative self-judgments. 
Since, however, the aged, high-economic, independent group and the young, low- 
economic group hold similar more positive views than the aged institutionalized 
group concerning skills in social interaction, there is evidence that economic status 
and present living conditions compensate to a considerable degree in these negative 
self-judgments. 

These findings were further elaborated on by the analysis of the differential 
responses of the three groups to the W. A. Y. technique. Associations related to role 
in life occur more frequently in the young adult group, somewhat less in the aged 
independent group, and relatively infrequently in the aged institutionalized group.
However, the responses of both aged groups reflected more concern with the affective
aspects of their life situations and longevity than was noted in the group of young
adults. Since the young adults are more actively engaged in everyday living, their
first associations about themselves seem to be related to their roles in life. On the
other hand, older individuals, who are viewing life more from the sidelines, seem to
be more concerned with the affective tone of their life situation.


SUMMARY AND CONCLUSIONS 


To evaluate the relationship of living conditions, economic status and age to 
self-judgments three groups were studied. An aged, indigent, institutionalized group 
was compared to an aged independent group of middle class status. As a control 
along the age dimension, a non-institutionalized group of young adults of low-econ- 
omic status was studied. Self-judgments were measured by the Self-Concept Quest- 
ionnaire and the W. A. Y. technique. 

Results showed that the aged institutionalized group views self-worth more 
negatively than the aged independent group, and this group is more negative in its 
views than the young adult group. The two aged groups hold similar, more negative 
attitudes toward present state of happiness and ability to contribute than the young 
adult group. No significant difference was found between the aged independent and
the young adult groups’ attitudes toward social competence. While the categories
of response to the W. A. Y. technique most frequently reported by the young adult 
group reflected their role in life, the two aged groups most frequently gave res- 
ponses in the categories of affect and longevity. 
THE RELATIONSHIP OF CERTAIN PERSONALITY FACTORS TO
PROGNOSIS IN PSYCHOTHERAPY¹


SELIG ROSENBERG 
Veterans Administration, Brooklyn Regional Office, Brooklyn, N. Y. 


PROBLEM 


As indicated by the comprehensive review of Windle, there is no paucity of 
prognostic studies attempting to relate psychological test performance to eventual 
outcome in psychotherapy. He concludes, however, that the results have been 
generally unsatisfactory and with little agreement among findings. Moreover, few 
studies have been made emphasizing the personality structure itself rather than the 
test protocol. The preponderance of research in this area derives its conclusions 
from test signs and scores and focuses on the quantitative aspects of the tests. This 
investigation stresses the evaluation of personality characteristics and uses psycho- 
logical tests as tools to reflect the personality picture of the patient and not as ends 
in themselves. Furthermore, the results are cross-validated on an equal sample
derived from the same population.


SUBJECTS 


The subjects for the study were a sample of forty white male veterans of World 
War II, aged twenty-five to thirty-five, who had been diagnosed as psychoneurotic 
by two staff psychiatrists and who were receiving individual psychotherapy treat- 
ment at the Mental Hygiene Clinic of the Veterans Administration, Brooklyn Re- 
gional Office. This sample consisted of two equal groups. One group was composed 
of definitely improved cases; the other group was composed of definitely unimproved 
cases. 

The subjects were obtained in the following manner: Each of ten psychiatrists 
was asked to submit two cases who had been seen for a period of nine months and 
who were, in the therapist’s opinion, definitely improved. Two other cases were 
requested who were also seen for a similar period and who were considered definitely 
unimproved. To supplement this admittedly subjective determination, the following
criteria, a modification of a suggestion by Hunt, were used:

Positive changes in adaptive ability or efficiency including: Changed ability to get along
with others, changed efficiency at school, changed efficiency on the job, new skills of any sort.

Positive changes in disabling habits and conditions including: Changes in level of anxiety,
modification of other presenting symptoms, changes in basic conflicts, changes in personality
traits.

Positive changes in attitude or understanding as evidenced from the patient’s
verbalizations including: Changes in attitude toward the self, discernment of relationships
between present behavior and events in the patient’s personal past.


A patient who was considered improved by the psychotherapist, and who fell into 
any of the above categories as determined by specific evidence in the case history, 
was accepted as an improved patient. The patient who was judged unimproved by 
the therapist and who was unable to satisfy any of the above criteria was considered 
an unimproved case. The total experimental population of 40, consisting of 20 im- 
proved and 20 unimproved cases, was culled from approximately 400 cases and 
represented a careful selection by the therapists to meet the criteria of improvement 
and unimprovement. 

These groups of improved and unimproved cases were randomly divided into
two improved groups of ten subjects each and two unimproved groups of ten
subjects each. One of the two improved groups was called Improved Group “A”; the
other was called Improved Group “B”. The unimproved groups were known as
Unimproved Group “A” and Unimproved Group “B”. The “A” groups were utilized
as the experimental population from which were derived the personality variables
significantly related to prognosis in psychotherapy. The “B” groups were
utilized for testing the validity of the conclusions reached from the investigation of
the “A” groups.

¹This is adapted from a dissertation submitted to The School of Education of New York
University. Acknowledgment is due to Dr. Brian E. Tomlinson and Dr. Bernard Kalinkowitz for
their counsel.


PROCEDURE I 


The Wechsler-Bellevue Intelligence Scale (Form I), the Rorschach, and a 
Sentence-Completion technique had been administered routinely by staff psycholo- 
gists to all patients starting individual psychotherapy sessions. The test protocols 
for all the experimental population were culled from the files and utilized as follows: 
The Wechsler-Bellevue Test Scores for the Improved Group “A” were compared
with the test scores of the Unimproved Group “A”, and a “t” test was used to
evaluate the significance of the obtained difference.
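
A comparison of this kind can be sketched with a pooled-variance “t” test for two independent groups of ten. This is only an illustration under stated assumptions: the article does not say which form of the “t” test was used, the scores below are hypothetical, and the function name `pooled_t` is introduced for the example.

```python
from math import sqrt


def pooled_t(sample_a, sample_b):
    """Student's t for two independent samples, assuming equal variances.
    Returns the t statistic and its degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a = sum(sample_a) / n_a
    mean_b = sum(sample_b) / n_b
    # Sums of squared deviations about each group mean.
    ss_a = sum((x - mean_a) ** 2 for x in sample_a)
    ss_b = sum((x - mean_b) ** 2 for x in sample_b)
    df = n_a + n_b - 2
    pooled_variance = (ss_a + ss_b) / df
    standard_error = sqrt(pooled_variance * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / standard_error, df


# HYPOTHETICAL full-scale IQs for two groups of ten cases each.
improved_iq = [118, 122, 125, 120, 115, 130, 121, 119, 124, 126]
unimproved_iq = [100, 104, 98, 110, 102, 95, 108, 101, 99, 103]

t_value, df = pooled_t(improved_iq, unimproved_iq)

# Tabled two-tailed critical values of t for 18 degrees of freedom.
CRITICAL_05_DF18 = 2.101
CRITICAL_01_DF18 = 2.878

print(f"t = {t_value:.2f}, df = {df}")
print("significant at .01:", abs(t_value) > CRITICAL_01_DF18)
```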

The Rorschach and Sentence Completion test protocols, representing cases
from the “A” group, with the improved or unimproved status not indicated, were
submitted to two staff psychologists who made independent qualitative judgments
which they translated to rating scales prepared by the investigator for 23
personality factors. Of these 23 factors, 13 were couched in Rorschach terminology and
were adapted from Morris and Nicholas. The scales measuring these 13 factors
included 5 intellectual variables and 8 affective variables, and consisted of 6 steps 
with ratings from zero to 5. The remaining 10 of the 23 variables were listed under 
Attitudes and Traits. These Attitudes and Traits were presented in the form of a 
dichotomy of positive and negative, and above-average and below-average traits. 
A description of the 23 variables is presented below: 


Intellectual Variables

1. The productivity of associations, that is, the facility with which the subject is able to
associate.
2. Concreteness, i. e., the emphasis on thinking which is predominantly concrete, the
tendency to attend to the more obvious details of a problem, the ability to meet practical
problems.
3. Rigidity of intellectual control, i. e., constriction in the intellectual-control pattern
resulting in the narrowing of emotional life. Behavior is built around form rather than
substance. Over-correct and over-conventional procedures are used as defenses. Emphasis is
placed on intellectual factors and away from emotionality.
4. The extent of stereotypy in thinking, the range of interests, the presence or absence of
many or varied interests.
5. Efficiency of intellectual functioning, i. e., the ability to make use of inner resources and
capacities, the ratio of organizational ability to creative capacity.

Affective Variables

1. Personality type, i. e., the balance between extratensive and introversive tendencies,
those tendencies distinguishing between people who are predominantly conscious from within
(introverts) and people who are predominantly stimulated from without (extraverts).
2. Inner maturity vs. immaturity, i. e., the level of immaturity in the fantasy life, the range
between the mature side of development of the inner life of fantasy and creativity and an
infantile level of psychic development.
3. Emotional balance, i. e., the balance between controlled and more uncontrolled
emotionality; the kind of responsiveness to emotionally toned stimuli emanating from the
environment.
4. Emotional depth, i. e., depth of the emotional life based upon the amount of affective
participation in responding to affectively toned stimuli.
5. Disturbance. This refers to disturbances in emotional responsiveness, the extent to which
a neurotic emotional disturbance reaction is reflected.
6. Mood quality. The quality of the mood life on a continuum from extreme dysphoria to
extreme euphoria.
7. Sensitivity. Sensitivity and tact in reacting to emotional stimuli.
8. Conflict. The amount of conflict caused by the frustration of drives.

Attitudes

1. Attitude toward mother. The presence of such feelings as love, devotion, affection, trust,
good will, and such feelings as hostility, resentment, criticism, distrust.
2. Attitude toward father. The presence of such feelings as love, devotion, affection, trust,
good will, and such feelings as hostility, resentment, criticism, distrust.
3. Attitude toward women. The presence of such feelings as good will, trust, affection, love
or such feelings as hostility, resentment, criticism, distrust.
4. Attitude toward authority. The presence of such feelings as acceptance, respect, trust,
good will, or such feelings as criticism, defiance, resentment, hostility.
5. Attitude toward the future. Tendency to be optimistic or pessimistic.
6. Attitude toward the past. Generally pleasant, happy feeling or generally unpleasant,
unhappy feeling.
7. Attitude toward the self. The feeling of being worth while and adequate, or worthless
and inadequate.

Traits

1. Energy level. The tendency to achieve, to try to overcome obstacles, to make efforts to
accomplish.
2. Dependency needs. The unwillingness to accept responsibility, the need for protection and
succorance, the need to be supported.
3. Health concern. The preoccupation with bodily symptoms and physical complaints.


The ratings from the two judges for the 23 personality variables were obtained
in this fashion for ten improved cases (Improved Group A) and ten unimproved
cases (Unimproved Group A). Treatment of these data involved an evaluation of
the reliability of these scales and an analysis of the extent to which any of the
variables differed between the improved and unimproved groups. To measure the
reliability of the thirteen rating scales, the coefficients of correlation between the
judges’ ratings were computed by the product moment correlation index, r. To
measure the difference between the groups on the variables judged by the rating scales,
a “t” test was utilized.
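
The inter-judge reliability coefficient described above can be sketched as a product moment correlation between the two judges’ ratings on one scale. The ratings below are hypothetical values on the zero-to-5 scale, and `pearson_r` is a name introduced here; the article reports no raw ratings.

```python
from math import sqrt


def pearson_r(x, y):
    """Product moment correlation between two equal-length lists of ratings."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares about the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# HYPOTHETICAL ratings (0 to 5) given by two judges to the same ten protocols
# on a single rating scale.
judge_1 = [0, 1, 2, 3, 4, 5, 2, 3, 1, 4]
judge_2 = [1, 1, 2, 3, 5, 4, 2, 2, 1, 4]

reliability = pearson_r(judge_1, judge_2)
print(f"inter-judge r = {reliability:.2f}")
```

A coefficient near 1 indicates that the two judges ordered the protocols in nearly the same way on that scale; the computation would be repeated for each of the thirteen scales.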

The data from the ratings of the two judges on the ten dichotomy scales were 
treated by a chi-square technique with a correction for small samples to determine 
the reliability of the scales and the extent to which the attitudes and traits differed 
in the improved and unimproved groups. 
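
For a two-by-two table built from a dichotomy scale, a chi-square with a small-sample correction can be sketched as below. The article does not name the correction; Yates’ continuity correction is assumed here as one common choice, and the cell counts are hypothetical.

```python
def yates_chi_square(a, b, c, d):
    """Chi-square for the 2x2 table [[a, b], [c, d]] with Yates' continuity
    correction (ASSUMED; the source only says 'a correction for small samples')."""
    n = a + b + c + d
    # Shrink the cross-product difference by n/2, never below zero.
    corrected = max(abs(a * d - b * c) - n / 2, 0)
    numerator = n * corrected ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# HYPOTHETICAL counts for one attitude scale:
#                 positive  negative
# improved            8         2
# unimproved          3         7
stat = yates_chi_square(8, 2, 3, 7)

CRITICAL_05_DF1 = 3.841  # tabled chi-square value, 1 degree of freedom, .05 level

print(f"corrected chi-square = {stat:.2f}")
print("significant at .05:", stat > CRITICAL_05_DF1)
```

With groups of ten the correction matters: the uncorrected statistic for the same hypothetical table is larger, so the corrected test is the more conservative one.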


RESULTS I


Analysis of the results revealed a number of factors among the intellectual 
variables, affective variables, and attitudes and traits which were significantly differ- 
ent in the pre-treatment personality of a subsequently improved group (Improved 
Group A) and a subsequently unimproved group (Unimproved Group A). Among 
the intellectual variables, the evidence indicated the following: 


1. The IQ’s of the improved group were on a superior level while the mean IQ’s
of the unimproved group were on an average level. This difference was
significant at the .01 level of confidence.

2. Productivity was significantly higher in the improved group. The unimproved
group tended toward below average productivity. The difference was
significant at the .05 level of confidence.

3. Rigidity was significantly less in the improved group. The unimproved
group showed greater than average rigidity. The difference was significant
at the .01 level.

4. Stereotypy was significantly less in the improved group. The unimproved
group showed greater than average stereotypy. The difference was
significant at the .01 level.


The affective variables produced two significant factors: 


1. The difference in emotional depth was significant at the .01 level of
confidence. The improved group reflected the capacity for deeper than average
feelings while the unimproved group showed more shallow feelings than
average.

2. Sensitivity was significantly higher in the improved group. The unimproved
group showed less than average sensitivity. The difference was significant
at the .01 level.

Among the group of attitudes and traits, the following were found to be significant:


1. Energy level was higher in the improved group. The unimproved group had
a lower than average energy level. The difference was significant at the .05
level.

2. Health concerns were fewer with the improved group. The unimproved 
group showed a greater than average bodily preoccupation. The difference 
between the two groups was significant at the .05 level. 


PROCEDURE II


The B groups were now utilized to determine to what extent the results de- 
rived from the A groups could be validated. 

The two raters were informed of the specific personality variables which had 
been found to be significantly related to psychotherapeutic improvement in the A 
groups. They were then asked to use these findings to make predictions of improve- 
ment or unimprovement from the protocols of the twenty cases in the B groups. 
(The given variables were assigned arbitrary weights by the investigator on the 
basis of their level of significance, with a 1% level receiving a weight of 2 and a 5%
level receiving a weight of 1.) The protocols were evaluated qualitatively and the 
extent of each prognostic variable was rated. The judges then considered the total 
extent of the variables related to improvement or unimprovement and, on that 
basis, made their predictions. 
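
The weighting scheme can be sketched as a simple signed sum. The weights follow the significance levels reported in Results I (2 for the .01 level, 1 for the .05 level); everything else is an assumption made for illustration: the judges in the study rated the variables qualitatively and no scoring formula is given, so the +1/0/-1 coding, the zero cutoff, and the names `WEIGHTS`, `prognostic_score`, and `predict` are hypothetical.

```python
# Weights taken from the reported significance levels: .01 -> 2, .05 -> 1.
WEIGHTS = {
    "intelligence": 2,
    "productivity": 1,
    "rigidity": 2,
    "stereotypy": 2,
    "emotional_depth": 2,
    "sensitivity": 2,
    "energy_level": 1,
    "health_concern": 1,
}


def prognostic_score(ratings):
    """Weighted sum of ratings. Each rating is +1 when the patient falls in the
    direction found in the improved group, -1 when in the direction of the
    unimproved group, and 0 when neither (ASSUMED coding)."""
    return sum(WEIGHTS[name] * ratings.get(name, 0) for name in WEIGHTS)


def predict(ratings):
    """Predict 'improved' when favorable weight exceeds unfavorable (ASSUMED rule)."""
    return "improved" if prognostic_score(ratings) > 0 else "unimproved"


# HYPOTHETICAL patient: bright, flexible, and sensitive, but with low energy
# and marked bodily preoccupation.
patient = {
    "intelligence": +1,
    "rigidity": +1,
    "sensitivity": +1,
    "energy_level": -1,
    "health_concern": -1,
}

print(prognostic_score(patient), predict(patient))
```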


RESULTS II


The results of the judgments in the B groups showed that Judge I was able to 
make sixteen correct predictions out of twenty cases. Judge II was able to make 
fifteen correct predictions out of twenty cases. These results were statistically sig- 
nificant at better than a 2% level of confidence. It was concluded that the list of 
significant variables obtained from the data on the A group was an effective device
for making predictions of therapeutic improvement on the B group.
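
One way to gauge how unlikely such hit rates are under pure guessing is an exact binomial tail probability. The article does not state which test produced its significance level, so the sketch below is an independent check under the assumption that each of the twenty predictions has a chance probability of one half; `binomial_upper_tail` is a name introduced here.

```python
from math import comb


def binomial_upper_tail(successes, trials, p=0.5):
    """Probability of at least `successes` correct predictions in `trials`
    independent attempts when each is correct with probability p."""
    return sum(
        comb(trials, k) * p ** k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Hit counts reported for the two judges, out of twenty B-group cases.
for judge, correct in (("Judge I", 16), ("Judge II", 15)):
    tail = binomial_upper_tail(correct, 20)
    print(f"{judge}: P(at least {correct} of 20 by chance) = {tail:.4f}")
```

The one-tailed probability is used because only better-than-chance prediction is of interest; doubling it gives a two-tailed figure.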


DISCUSSION


Utilizing the results of this study, composite pictures may be arrived at which 
will portray two hypothetical patients: one possessing all those personality factors 
which are indicative of a favorable prognosis in psychotherapy and one possessing 
all those personality factors which point to an unfavorable prognosis. It is stressed, 
however, that in reality most patients will have both favorable and unfavorable 
factors and the prognostic decision would be arrived at after evaluating their res- 
pective weights. 

The patient who is most likely to improve in therapy has a superior intelligence 
and the ability to produce associations easily. He is not rigid and has a wide range of 
interests. He is able to feel deeply and is sensitive to his environment. In addition, 
he exhibits a high energy level and is relatively free from concern with bodily symp- 
toms. 

The patient who is least likely to improve in therapy has an intelligence no 
higher than average and does not easily produce associations. He tends to be rigid 
and to have a narrow range of interests. He does not feel deeply and has a less than 
average sensitivity to his environment. His energy level is low and he tends to be 
overconcerned with bodily symptoms. 


CONCLUSIONS 


Bearing in mind the limitations of the experimental group which is restricted to 
a small white veteran sample with a diagnosis of psychoneurosis, and an age range 
of 25 to 35, the following conclusions seem justified: 


1. Certain personality factors are definitely associated with the ability of
neurotic patients to improve in therapy.

2. Utilizing these personality factors as a guide, it is possible for the
experienced clinician to predict the course of therapy with considerable accuracy from
pre-treatment psychological tests.
ETHNOCENTRIC ATTITUDE CHANGES AND RATED IMPROVEMENT
IN HOSPITALIZED PSYCHIATRIC PATIENTS¹


EPHRAIM ROSEN 
University of Minnesota 


INTRODUCTION 


One of the basic hypotheses of many conceptualizations of ethnocentrism can 
be expressed as follows: Ethnocentric ideology is in large part a projection of other- 
wise unsolved intrapsychic conflicts. A simple prediction stemming from this hypoth- 
esis is that there should be a positive relationship between decrease in ethnocentric 
ideology and improvement in mentally ill patients, on the assumption that improve- 
ment indicates a decrease in severity of personality conflicts. This relationship 
should hold whether the improvement is a function of psychotherapy, shock therapy, 
or no formal therapy at all, for the particular source of conflict alleviation, if there is 
such alleviation, would seem to be irrelevant to the prediction. 

Very little has so far been done to test this prediction. Levinson, on the basis
of a study of psychiatric clinic patients, concluded that high scorers on ethnocentrism
evidenced a personality syndrome which would be resistant to psychotherapeutic
change by virtue of their rigidity, constriction, conventionalized thinking, and weak
interpersonal relations. Indirectly this conclusion would seem to imply that patients
who improve, insofar as the improvement involves decrease of rigidity, constriction,
etc., should also show decreased ethnocentrism. Levinson also found that her
patient sample was on the average relatively unprejudiced, that there was no clear
relationship of degree of ethnocentrism to diagnostic category, but that there was a
positive relation between ethnocentrism and overall poorness of adjustment as
judged by elevation of Minnesota Multiphasic Personality Inventory (MMPI)
profiles.

Negative evidence relevant to the present study comes from Carpenter’s
work. He found a zero correlation between ethnocentrism scores and the difference
between pre- and post-shock MMPI scores in a group of psychotic patients treated
by EST. A zero relation also obtained between ethnocentrism and the difference
between pre- and post-shock clinical ratings. No study was made of ethnocentric
change itself, but as a predictor of adjustmental change, ideology seemed irrelevant.

¹This research was supported by a grant from the Graduate School of the University of
Minnesota.

On the other hand, Barron, working with neurotic outpatients treated by
relatively brief psychotherapy, found that the E (ethnocentrism) scale correlated
-.64 with ratings of improvement. The scale was a better predictor than the MMPI,
Wechsler, or Rorschach. When intelligence was partialled out the correlation
dropped but was still significant at -.34. Again, ethnocentric change as such was
not studied, but one implication of the study would seem to be that personality con- 
flict is more readily alleviated when it is not associated with a strong ethnocentric 
ideology. 


METHOD 


The present study investigated the relation between the variables of
ethnocentrism and improvement in psychiatric in-patients. Thirty-four male and 70 female
consecutive admissions to the Psychiatric Division of the University of Minnesota
Hospitals were given a test of ethnocentrism on admission and again immediately
prior to discharge. As soon as possible after discharge the psychiatric resident in
charge of the particular case rated improvement on “underlying personality
difficulties” on a four point scale of worse, no change, slightly improved, and much
improved. Similar ratings were made for “symptomatic picture” change, and
information was gathered on type of therapy and amount of therapy. The design, from the
point of view of experimental investigation of theoretical concepts, is crude in that
patients were admitted to the sample regardless of age, intelligence, diagnosis, type
of therapy, or any other variable. The crudeness was intentional, however, for the
study was oriented to the clinical question “What happens to the ethnocentric level
of a broad sample of patients as a function of hospital stay?” as well as to the
theoretical question “How are ethnocentric change and psychiatric improvement related?”

TABLE 1. AGE, INTERTEST INTERVAL, TYPE OF TREATMENT, HOURS OF PSYCHOTHERAPY, PRIMARY
DIAGNOSIS, AND RATED IMPROVEMENT

Factors                               Male (N=34)    Female (N=70)
Age: Mean                             41.2           35.8
Age: Range                            18-73          17-69
Intertest interval in days: Mean      99             38
Intertest interval in days: Range     2-105          4-95
Hours of psychotherapy: Mean          9.0            11.5
Hours of psychotherapy: Range         4-24           3-55

Type of treatment (None; EST only; EST plus psychotherapy; IST plus psychotherapy;
Psychotherapy only) and primary diagnosis (Reactive depression; Schizophrenia; Mixed
psychoneurosis; Organic psychosis; Psychopathic deviate; Involutional; Paranoid reaction;
Conversion hysteria; Manic-depressive psychosis; Other):
[Counts largely not legible in the source. Surviving entries, column uncertain where only
one value is given: IST plus psychotherapy 4; Psychotherapy only 18; Reactive depression 10;
Schizophrenia 7 (male), 14 (female); Mixed psychoneurosis 10; Organic psychosis 5.]

Rated improvement          Underlying Personality Difficulties    Symptomatic Picture
(in per cent of total)     Male         Female                    Male         Female
Worse                      0.0          1.4                       0.0          1.4
No change                  64.7         45.7                      26.5         18.6
Slightly improved          26.5         44.3                      44.1         34.3
Much improved              8.8          8.6                       29.4         45.7
Total                      100.0        100.0                     100.0        100.0

Table 1 summarizes, for the 104 subjects of the study, age, time interval be- 
tween the two administrations of the ethnocentrism test, type of treatment, hours 
of psychotherapy, primary diagnosis, and rated improvement. The considerable 
variance on all these measures should be noted. Since rated improvement is one of 
the two major variables of the study it is important to note that females either 
actually improved more than males or were rated more generously. The difference 
between ratings of personality change of males and females is not quite significant at 
the 5 per cent level on a Chi-square test. For symptomatic improvement the direc- 
tion is the same but nowhere near significance. These differences are particularly 
striking since, as might be expected, personality difficulty improvement is rated less 
generously for the total sample than symptomatic improvement. 
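
The chi-square comparison of male and female ratings mentioned above can be illustrated with a minimal sketch. The source does not say how the rating categories were grouped for the test, so the 2 x 2 collapse used here (worse or no change versus slightly or much improved) is an assumption, and the counts are back-computed from the Table 1 percentages rather than taken from the original data. The function name `chi_square` is illustrative only.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of sex and rating.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof


# Assumed counts for "underlying personality difficulties", back-computed
# from the Table 1 percentages (e.g. 64.7 per cent of 34 males = 22).
# Rows: male, female. Columns: worse or no change, slightly or much improved.
personality_ratings = [[22, 12], [33, 37]]
statistic, dof = chi_square(personality_ratings)
print(f"chi-square = {statistic:.2f}, df = {dof}")
```

With one degree of freedom the tabled chi-square value at the 5 per cent level is 3.84, which is the figure the printed statistic would be compared against.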

The measure of ethnocentrism used for test and retest consisted of 54 questions
to be answered on a scale of agree very much, agree pretty much, agree a little, and
similarly for three degrees of disagreement. No neutral response was allowed. Responses
were scored from minus three to plus three. The items were selected from the scales
used by Adorno et al. in the California study of authoritarian personality.
Twenty-nine of the items were taken from the F (pre-Fascist) scale, form 45 (pp. 255-257); 20
from the E (ethnocentrism) scale; and five from the PEC (politico-economic
conservatism) scale, form 45. Total test score was computed as the algebraic
sum of individual item scores.
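
The scoring rule just described can be sketched as follows. This is only an illustration under stated assumptions: agreement is taken to score in the positive direction, no item is reverse-keyed, and the wording of the three disagreement options mirrors the agreement options. None of these details is given in the text, and the names `RESPONSE_SCORES` and `total_score` are hypothetical.

```python
# Six-point response scale with no neutral option; agreement scored as
# positive is an assumption, as is the wording of the disagree options.
RESPONSE_SCORES = {
    "agree very much": 3,
    "agree pretty much": 2,
    "agree a little": 1,
    "disagree a little": -1,
    "disagree pretty much": -2,
    "disagree very much": -3,
}

N_ITEMS = 54  # 29 F-scale + 20 E-scale + 5 PEC-scale items


def total_score(responses):
    """Algebraic sum of item scores for one administration of the test."""
    if len(responses) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} responses, got {len(responses)}")
    return sum(RESPONSE_SCORES[r] for r in responses)


# A respondent who agrees a little with half the items and
# disagrees a little with the other half lands on the neutral point.
example = ["agree a little"] * 27 + ["disagree a little"] * 27
print(total_score(example))
```

Under these assumptions total scores can range from -162 to +162, with zero as the neutral point.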


RESULTS 


Table 2 presents pre- and post-hospitalization ethnocentrism scores of the 
sample. Both male and female patients became more ethnocentric during hospital- 
ization. In both test and retest males were less ethnocentric than females, though 
not significantly so. The absolute level of ethnocentrism on both test and retest 
for both sexes was slightly toward the neutral point in terms of the scoring system 
used, so that the group may be considered, in terms of the test, neither very ethno- 
centric nor the reverse. Variability was enormous, however. The high correlation of 
test and retest is also to be noted. In summary, then, the patients showed a wide 
range of ethnocentrism, changed slightly in the direction of more ethnocentrism 
during hospital stay, but maintained their individual levels as compared to the rest 
of the group. 


TABLE 2. ETHNOCENTRISM SCORES OF PATIENTS STUDIED

                                    Male        Female      Total
Scores                              N = 34      N = 70      N = 104

Pre-hospitalization       M         2.9         18.2        13.2
                          σ         53.4        56.8        60.1

Post-hospitalization      M         10.3        22.8        18.
                          σ         60.6        63.2        56.3

Difference                          +7.4        +4.6        +5.6
Significance of difference*         <.05        <.15        <.05

r of pre- and post-tests            +.93        +.92        +.92

*Significance of difference was calculated by the t test for correlated measures.
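
For readers unfamiliar with the t test for correlated measures, the sketch below shows one standard way to obtain it from summary figures of the kind reported in Table 2 (mean difference, the two standard deviations, and the test-retest correlation). It is not claimed to reproduce the author's computation: whether the printed standard deviations use N or N - 1, and whether the significance levels are one- or two-tailed, is not stated. The function name `paired_t_from_summary` is hypothetical.

```python
import math


def paired_t_from_summary(mean_diff, sd_pre, sd_post, r, n):
    """t statistic and degrees of freedom for correlated (paired) measures.

    The variance of the difference scores is recovered from the two
    standard deviations and their correlation:
        var(D) = sd_pre^2 + sd_post^2 - 2 * r * sd_pre * sd_post
    """
    var_diff = sd_pre ** 2 + sd_post ** 2 - 2 * r * sd_pre * sd_post
    standard_error = math.sqrt(var_diff / n)
    return mean_diff / standard_error, n - 1


# Male column of Table 2, entered as printed (an illustration only).
t_male, df_male = paired_t_from_summary(7.4, 53.4, 60.6, 0.93, 34)
print(f"t = {t_male:.2f}, df = {df_male}")
```

The high test-retest correlation is what makes a shift of a few points detectable at all: the larger r is, the smaller the variance of the difference scores and the larger t becomes for the same mean difference.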
Before concluding that the shift in the direction of greater ethnocentrism was
actually a function of hospitalization it must be shown that this shift does not occur 
with time alone. Consequently a control group of 17 males and 15 females, ranging 
in age from 19 to 51 with a mean at 29, were tested seven weeks apart. The control 
subjects were extension division university students whose absolute level of ethno- 
centrism was considerably lower than that of the patients. As with the patient 
group there was a small but non-significant shift toward greater ethnocentrism in 
males, but in females there was none. Reliability of the test was as high as for the 
hospital group. It seems likely therefore that mere retesting with the scale here 
used produces some tendency to higher scores, so that the patient group can not be 
said to have become more ethnocentric due to hospitalization as such. In the light 
of the fact that over half of the patients were rated as having improved it can be 
concluded that such ratings can be associated with no decrease in ethnocentrism, 
but only very tentatively can it be inferred that such ratings can be associated with 
an actual slight increase in ethnocentrism. 

It would still be possible, however, for patients who improved most to show a 
decrease in ethnocentrism. Analysis of the data indicated that this did not happen. 
The difference between first and second tests was not related significantly to rated 
improvement in either symptoms or personality, for either sex. All correlations of 
this type were close to zero, most of them slightly positive, contrary to the theoreti- 
cal prediction of significant negative correlations. Nor was there a relation between 
ethnocentric change and hours of psychotherapy. The major hypothesis of the study 
was thus answered in the negative, although only for a hospital population of the 
type here studied. 

In view of the work of Levinson, Barron, and Carpenter referred to earlier, 
analysis was also made of relation between first test and second test on the one hand, 
each taken alone, and rated improvement. Again, there were no significant correla- 
tions for either sex. Of the data gathered, the only variable which showed a signifi- 
cant relationship to ethnocentrism score was age. Both first and second tests cor- 
related with age, for each sex, at a .01 level of significance, the smallest of the four 
correlations being +.44. Change in the ethnocentric direction was also associated 
with greater age, but not to a significant degree. In the control group, age and ethno- 
centrism were not related. 

Analysis of differential ethnocentric change for EST and non-EST treated 
patients showed increased ethnocentrism for both groups. Among females, use of 
EST was associated with significantly greater increase of ethnocentrism than non- 
use, but the direction was reversed among males, so that no statement can be made 
as such about the relation of EST to ethnocentric change. 

The last type of analysis done was of diagnosis and ethnocentric scores (Table 3).
Analysis was restricted to diagnostic groups having an N of at least 10. None of the
male diagnostic sub-groups met this criterion and hence they were omitted from
analysis. The groups were too small and variable for statistical analysis, but taking
the results very tentatively at face value they are surprising: Schizophrenia, a
“deeper” personality disturbance than depression or mixed neurosis, shows less
ethnocentrism; and all groups but psychopathy, which is notoriously a category
evidencing little if any basic change during hospitalization, show increased
ethnocentrism. The psychopaths alone tended toward decreased ethnocentrism. One is
tempted to conclude that the attitude change in the psychopaths was not genuine
at all but was merely a function of their belief that such an apparent change of
attitude was the proper thing to evidence.


TABLE 3. DIAGNOSIS AND ETHNOCENTRISM (FEMALE PATIENTS)

                                 Pre-hospitalization    Post-hospitalization
Groups                    N      score                  score                   Difference

Schizophrenia             14     -17.4                  -11.5                   +5.9
Reactive depression       10     +23.7                  +36.0                   +12.3
Mixed psychoneurosis      10     +12.3                  +28.2                   +15.9
Psychopathic deviate      10     +23.7                  +11.5                   -12.2


DISCUSSION 


Although the major analysis of this study showed no relation between ethno- 
centric level or change and improvement, there were indirect suggestions of a reverse 
effect to the theoretical prediction. The minor shift of the total group toward greater 
ethnocentrism, the fact that female patients (unlike the controls) scored higher than 
males and were also rated as improving more than males, and the surprising scores
of schizophrenic and psychopathic patients all tentatively tend to point in this
direction.

The writer feels that these results can best be explained on the assumption that 
the improvement noted in the patients was not in the direction of reduction of con- 
flicts but rather in the direction of better control of unchanged conflicts. The hours 
of psychotherapy for those patients who received it averaged only nine for males 
and 11.5 for females, while a considerable group received no psychotherapy at all. 
It may safely be presumed that such a regimen conduces to better control of diffi- 
culties rather than to personality reorganization such as is attempted, for example, 
by psychoanalysis. Generally speaking, it is known that for most seriously disturbed 
hospital patients who improve, the improvement is in the direction of social recov- 
ery. For many patients social recovery entails a bolstering of defenses rather than a 
dissolution of conflicts defended against. Even though one of the two sets of ratings 
used was improvement in “underlying personality difficulties” rated improvement 
may have indicated control rather than reduction of difficulty. Ethnocentrism, in- 
sofar as it is a projective defense socially sanctioned by a sizeable part of the general 
population, may thus aid in controlling conflicts for many patients and hence may 
tend to increase with certain types of psychiatric improvement. 

If this interpretation has validity then ethnocentrism should decrease with 
adjustmental improvement only when therapy is of the “uncovering” type and when 
measures of adjustment specifically exclude better control of conflict as an indicator 
of personality improvement. The present study did not attempt this type of an- 
alysis. 


SUMMARY 


1. One hundred and four hospitalized psychiatric patients were tested for 
ethnocentrism on admission and again on discharge. They were also rated by the 
psychiatrist in charge of each case on a scale of improvement in underlying person- 
ality difficulties and on a scale of improvement in the symptomatic picture. 

2. Analysis of relation between change in ethnocentrism and rating of improve- 
ment was performed to test the prediction that ethnocentrism should decrease with 
improvement. The prediction was not borne out. Similarly the ethnocentric level 
at admission and at discharge, respectively, were unrelated to improvement. 

3. Analysis of other aspects of the data led to the conclusion that there was a 
slight tendency for ethnocentrism to increase with hospitalization, and in all likeli- 
hood to increase with improvement. This conclusion was interpreted as being con- 
sistent with the fact that improvement in seriously disturbed hospital patients fre- 
quently is an indicator not of dissolution of personality conflicts but of a more 
efficient defensive handling of conflicts. 


4. It was also found that in the hospital sample used ethnocentrism varied 
very widely, was on the average not strikingly high in an absolute sense, correlated 
to a greater extent with age than with any other variable studied, and could be 
measured with high reliability. 
THE EFFECTS OF ATTITUDES TOWARDS AUTHORITY ON
PSYCHOTHERAPY*


RICHARD H. DANA 
Minnesota Bureau for Psychological Services 


PROBLEM 
The confirmation or denial of psychoanalytic assumptions involved in content
analysis of the Rorschach is a current challenge to research oriented clinicians. That
Card IV is a significant representation of “father” or “authority” is becoming
empirically evident. However, there is only one study which accepts the symbolic
meaning of Card IV and proceeds to test other hypotheses. The testing of specific
clinical hypotheses, using results of empirical research on psychoanalytic tenets, is
one appropriate function of clinical science today.

Authority, in our culture, is a prerogative of both individuals and institutions. 
Crucial to the successful assumption of adult social roles is integration of the interior- 
ized dictates of the various authorities impinging upon the child, i.e., superego de- 
velopment. In the process of socialization many attitudes towards the conditions 
and agents of learning emerge in those on whom the culture is being imposed. Not 
the least formidable of these attitudes is the feeling towards authority, authority 
figures and authority symbols. This feeling is compounded of hostility and love and 
often expressed in ambivalent relations with authority figures. 

Psychotherapy, as a learning situation, has these already established attitudes 
towards authority brought into the relationship between therapist and patient. 
The present study is concerned with the relation between attitudes towards auth- 
ority and progress in short term and long term psychotherapy. These attitudes 
towards authority were judged from responses given to Card IV of the Rorschach. 

The following hypotheses are made: (1) “adequate” attitudes towards authority 
constitute one criterion of relatively good prognosis for both short term and long term 
psychotherapy; (2) “inadequate” attitudes towards authority constitute one cri- 
terion of relatively poor prognosis for both short term and long term psychotherapy; 
(3) “negative” or hostile attitudes towards authority constitute one criterion of 
relatively poor prognosis for short term psychotherapy and relatively good prognosis 
for long term psychotherapy. 


METHOD 


The sample studied consisted of ninety patients at the Danville Veterans
Administration Hospital from 1949 through 1951 who produced one or more responses
to Card IV of the Rorschach and subsequently received individual psychotherapy.
The short term group included 44 patients who received from 6 to 19 sessions (mean,
12). The long term group included 46 patients who received 20 or more sessions
(mean, 51; range, 20 to 153).

*The author wishes to express his appreciation to Dr. K. C. Jost, Chief Clinical Psychologist, and
the psychology department of the V. A. Hospital, Danville, Illinois, for their cooperation in
this study.

Card IV Rorschach responses were separated into three categories, “adequate”,
“inadequate”, “negative”, by eight experienced clinicians. Criteria used for
judgment were: (1) “adequate”: the ability to produce a response in which the
content, considered in regard to what it may mean in relation to the individual’s
perception of authority, is of neutral tone with good form quality. Examples
include animal skin, boots, and some human and animal percepts. (2) “inadequate”:
the blatantly inadequate response, form poor, reflecting no ability to handle the
implications of the perception. Examples would include gross distortions,
bizarreness, contaminations, confabulations, mutilations, anatomy and most sex responses.
(3) “negative”: responses reflecting hostility, adequate with regard to form quality
but which emphasize largeness of size, threat, and overpowering aspects of the
percept. Examples would include bull, ape, gorilla, monster, dragon.

Agreement among judges, due to structuring of criteria, was 88%. The experi- 
menter read the psychotherapist’s summary of the case and judged whether “im- 
provement” or “no improvement” had occurred. The results of this judgmental 
reading of case summaries were not known to those who judged the Card IV res- 
ponses. The psychotherapy summaries followed a fairly standard form and usually 
included an explicit statement of outcome and prognosis. As the summaries function 
merely as information for future therapists who might be working with these same 
patients, there is no pressure for other than accurate, factual records.
“Improvement”, as a result of psychotherapy, was discerned in 52 cases; “no improvement”
in 38 cases.


RESULTS 


The data are summarized in Table 1. The three hypotheses were tested by ap- 
plication of chi-square (Table 2). The results for hypothesis one are in the predicted 
direction. “Adequate” responses to Card IV tend to be related to favorable
prognosis in both short term and long term treatment, although the relation reaches
the .05 level only for short term treatment (Table 2). There is no difference in
prognosis between short term and long term treatment with responses judged
“adequate”.

Hypothesis two was confirmed. Responses judged “inadequate” indicate poor 
prognosis for both short term and long term psychotherapy. An unexpected result 
here is that a significantly greater percentage of patients failed to improve with 
short term treatment than would be expected by chance. Short term psychotherapy 
may actually be detrimental to individuals producing Rorschach Card IV responses 
judged ‘inadequate’. 


TABLE 1. “ADEQUATE”, “INADEQUATE”, “NEGATIVE” CARD IV RESPONSES
AND RESULTS OF SHORT AND LONG TERM PSYCHOTHERAPY

Type of                          Results of Psychotherapy
Therapy      Responses           Improved      Unimproved

LONG         Adequate            14
             Inadequate          5
             Negative            12

SHORT        Adequate            13
             Inadequate          4
             Negative            4

Totals                           52            38

[Individual entries in the Unimproved column are illegible in the source.]
Results for hypothesis three are as predicted. There was only chance improve-
ment with short term treatment for those with responses judged “negative”. Sig- 
nificant improvement occurred with long term treatment, and the difference between 
short term and long term treatment was significant. “Negative” responses to Card 
IV are suggestive of good prognosis with long term psychotherapy. 


TABLE 2. COMPARISON OF CARD IV RESPONSES JUDGED “ADEQUATE”,
“INADEQUATE”, “NEGATIVE”, FOR IMPROVED AND UNIMPROVED PATIENTS
IN SHORT AND LONG TERM THERAPY AND BETWEEN SHORT AND LONG
TERM THERAPY

                 Long term            Short term           Long vs. Short
CARD IV          Chi                  Chi                  Chi
Responses        Square    P          Square    P          Square    P

Adequate         3.20      > .05      4.76      <.05       .20       -
Inadequate       .09       -          8.04      <.01       2.49      -
Negative         9.30      <.01       0.00      -          4.53      <.05








It is important to recognize that the size of the sample and the small cell entries
involved invoke caution with regard to inferences and generalizations from these
results.
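
The text does not state how the chi-square values for a single response category were obtained. One common reading, assumed here purely for illustration, is a test of the improved and unimproved counts against an even (chance) split. The sketch below uses hypothetical counts, not the study's data, and the function name `chi_square_even_split` is likewise hypothetical.

```python
def chi_square_even_split(improved, unimproved):
    """Chi-square (1 df) testing two counts against a 50/50 chance split."""
    total = improved + unimproved
    expected = total / 2  # same expected count in both cells
    return ((improved - expected) ** 2 + (unimproved - expected) ** 2) / expected


# Hypothetical counts for one response category in one therapy group.
print(chi_square_even_split(15, 5))   # clear departure from chance
print(chi_square_even_split(4, 4))    # exactly the chance split
```

With cell counts this small a few cases either way move the statistic across the conventional significance levels, which is the reason for the caution expressed above.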


DISCUSSION


It is only by implication that this study deals with the relation of attitudes 
towards authority and psychotherapy. Responses to Rorschach Card IV, which 
previous research has linked to authority, do have predictive value when judged in 
the manner proposed by this study. The present results suggest that Card IV can 
be used alone as a predictive measure. Whether attitudes towards authority per se 
are involved can only be determined by further validation of Card IV as an authority 
symbol or independent validation of attitudes towards authority by other measures. 
Two approaches to this validation are suggested: (1) other measures of attitude
towards authority could be correlated with results of psychotherapy; (2) an
independent measure of attitude towards authority could be correlated with the present
judgments of Rorschach Card IV attitudes towards authority. The present study is exploratory
in nature and provides a test of a clinical assumption. The positive trends in the 
results suggest repetition with a larger N and in different settings. 

The clinician must absorb and often attempt to modify his patient’s attitudes
towards authority. A persistent problem for psychotherapists is present in those
patients producing “inadequate” Card IV responses. Their apparent lack of
improvement in either short term or long term treatment may reflect relationships
with authority in which active, reality oriented reaction of any sort is absent. Of
course, many patients producing “inadequate” Card IV responses are diagnosed
psychotic. It would be interesting to have the results of intensive and extremely
long term treatment (several years duration) with individuals producing responses
judged “inadequate”.

The objectification of criteria for assignment of patients to psychotherapy can
clarify initial goals and save the time of both patient and therapist in the course of
treatment. In order to do this effectively the use of projective materials appears
both desirable and feasible.


SUMMARY 


Rorschach Card IV responses of ninety psychotherapy patients were judged 
“adequate”, “inadequate”, “negative” with regard to criteria concerning the per- 
ception of authority. Analysis was made in terms of three hypotheses specifying 
prognosis for improvement in short term and long term psychotherapy. Results
suggest that Card IV responses may be used with some success to select patients 
who will respond favorably to either short term or long term psychotherapy. 
SOME NOTES ON PSYCHOLOGY IN GERMANY, 1953¹
STANLEY B. ZUCKERMAN


State of Minnesota, Department of Public Welfare 


Because Germany figured prominently in the development of psychology until 
the “blackout” of free scientific thinking and intellectual inquiry that came with the 
advent of the Nazi era, it is timely to comment on the state of segments of the pro- 
fession nine years after the overthrow of Hitler. These notes have special reference 
to clinical psychology which the writer had occasion to observe from September 
through December 1953 while a U. S. Specialist under the auspices of the Depart- 
ment of State International Educational Exchange Program. Serving in a consulting 
role to a state Social Welfare Ministry in the northwestern (British) zone of Germany, 
the writer had opportunity to make some observations in most of the West German 
Republic (including such key cities as Berlin, Hamburg, Munich, Bremen, Frankfurt,
Cologne, Essen and Dusseldorf) which may be of interest to psychologists in
America.


THE STATUS OF PSYCHOLOGY


Psychologists are relatively few in number for the present population of Western 
Germany by comparison with the United States. Not only are psychologists a small 
group but they are, moreover, quite limited in influence. This situation stems partly 
from the fact that the German public is by no means as accepting of the profession 
and its practitioners as is the American public. It is hard to avoid the feeling that 
psychology in Germany today does not venture far from the university and does not 
come nearly as close to the life of the average citizen as in America. Even within the
university, psychology does not have the popularity or recognition that it enjoys on
the campus here. Despite important scientific contributions of individual German psychologists
in the past, psychology still appears to be struggling to establish itself in the uni- 
versities as a discipline independent of philosophy. 


HOW PSYCHOLOGISTS ARE UTILIZED


In non-academic positions in the fields of “applied psychology”, not very many 
psychologists are employed though conditions may be somewhat different in the 
southern portion of Germany in which the American influence is greater. Taking 
hospitals as an example, the writer found psychiatrists, some of whom had at least 
been exposed to psychology, doing clinical testing. In the psychiatric ward for 
children in one of the better large general hospitals in the Rhineland, the staff
psychiatrists and nurses (the latter without formal training in psychology) were doing
clinical testing using measures of intelligence as well as projective techniques.
Without questioning their intent or sincerity, it was obvious that these people were
working at tasks which would more effectively have been done by psychologically
trained clinicians. One statewide hospital’s psychiatric ward for children was not
atypical in having a psychiatric staff, but no psychologist. Similarly, the mental
hospitals employ few psychologists to date. The writer visited a mammoth institution
which serves more than five thousand mentally ill, defective and welfare cases.
Though some of the psychiatric staff had psychological training, the one-man
Psychology Department averred that psychological services are not much more
plentiful even in smaller hospitals.

¹The courtesies of two other recent U. S. Specialists, Prof. Gisela Konopka of the University of
Minnesota and Dean Leo F. Cain of the San Francisco State College, in commenting on the
manuscript are gratefully acknowledged.

Just as physical damage to the cities is impressive even after rapid reconstruc- 
tion, one cannot be in Germany for any length of time without noticing the large 
number of physically handicapped persons as an aftermath of the war. While a 
relatively high proportion of the population receives some form of benefits for war 
injuries or losses, there is no large separate agency with its network of hospitals and 
out-patient clinics corresponding to our Veterans Administration and employing 
many trained clinical and counseling psychologists. 


Psychological services are conspicuously absent from the German schools with 
the possible exception of the elementary and secondary schools of Bavaria in the 
southern part of the country. For example, in the State of Northrhine-Westphalia,
with a population of 16,000,000 and with some 1,600,000 youngsters in school, not 
one psychologist was employed in the educational systems in his specialty to serve 
this large number of pupils. Notwithstanding, some 25,000 youngsters out of this 
total group have been segregated into schools for the retarded (Hilfsschulen). None 
of these youngsters has been tested by a school psychologist or adequately evaluated 
psychologically by American standards. Cain has observed recently in this con- 
nection that the schools are not fully aware of the potential values and hence are 
“not really interested in psychological services.”* Though teachers are highly trained
they are generally academically oriented, especially at the secondary school level. 
They seem geared to stressing achievement presumably at the expense of a deeper 
understanding of the individual pupil and his problems. Although large classes and 
over-crowded schools may be slightly extenuating factors, much work remains to 
be done and German school systems need real encouragement to stimulate their 
utilizing psychological services. 


One quite encouraging trend was noted in the field of out-patient clinical ser- 
vices. It was especially heartening to find child guidance clinics in the American 
pattern being established and functioning in the cities of Bremerhaven, Bremen 
and Berlin, among others. The team-principle of collaboration between psychiatrist, 
psychologist and psychiatric social worker appeared to be accepted and applied in 
these settings. As team members, though doing some therapeutic work, psycholo- 
gists were recognized in these clinics for their diagnostic contributions derived from 
both projective and objective techniques. 


Psychologists play a very minor role in the German judicial process. Those 
psychologists who do serve the courts are called in ordinarily on a consulting basis 
from one of the university-affiliated “Psychological Institutes” in the country— 
most often to appraise the reliability of testimony. Judicial personnel have been less 
interested up to now in the offender than in the offense committed; consequently 
psychologists have been looked to less for information on the personality problems 
and motivations of offenders than for an evaluation of the credibility of testimony. 
However, some of the psychologists who work with the courts are trying to do an 
educational job to show what psychological data can contribute towards diagnosis 
and treatment of the delinquent. Moreover, enlightened jurists and penal adminis- 
trators throughout the country are calling for increasing attention to the psycho- 
logical understanding of offenders. 


*Cain, Leo F. Personal communications, March 19, 1954. 
There are few psychologists serving in their specialty in prisons and correctional
institutions despite the size of the delinquency problem. Although a couple of psy- 
chologists are prison directors in the Rhineland, there were no more than three 
psychologists doing clinical work in penal settings in the large area not under ex- 
clusive American supervision. Of these, two psychologists serve as members of the 
diagnostic team in the rather progressive correctional system of the city of Hamburg, 
and one psychologist is working in a maximum-security prison in the western sector 
of Berlin. Psychologists also are employed in a few of the correctional homes for 
problem children and younger offenders. However, they are a small group numeri- 
cally. In the fledgling probation and parole services now developing in the country, 
as yet psychologists have played little part. 

Concerning private practice, a certain amount is being done. The writer visited 
with several practitioners who were university trained at the equivalent of our 
Master’s or Ph.D. level. Such private clinical work seemed to be done on a referral 
basis from courts, welfare agencies, or physicians. The private work of these in- 
dividuals was in addition to their regular academic jobs, however. Overall, the ex- 
tent of this private practice is quite limited. 

As for other settings, psychological services to industry are far more limited 
than in the United States. Such personnel with backgrounds in clinical psychology 
as have been employed are not being used as consultants utilizing a broad range of 
techniques to evaluate and select key employees, nor as staff counselors. Rather, 
they seem to be used primarily as handwriting analysts. Data on this point are 
somewhat limited, however. 

One of the main employers of non-academic psychologists at present is the 
network of Labor Offices in Germany. Universities excepted, the largest single 
group of psychologists in the state of Northrhine-Westphalia was working as testers 
and to a lesser extent as vocational counselors in the district Labor Offices. 


In the early stages of the last war, quite a few psychologists were utilized by the 
Armed Forces in service and research roles. Since postwar Germany has no regular 
military establishment but only a modest “police force”, the psychologists formerly
so engaged have found employment instead in such other settings as Youth Offices, 
Labor Offices and the Psychological Institutes. 


THE FUNCTIONING OF THE CLINICIAN 


Clinical psychologists are functioning primarily as testers employing a variety 
of individual and group instruments. On examining these materials, a colleague 
from the United States is likely to get the impression that their objective measuring 
instruments are somewhat “behind the times” and not as well standardized as those
with which we are accustomed to working. Generally, there seemed to be less at- 
tention paid to the availability of large-scale norms in tests used in the field. In 
addition, there appeared to be a greater element of subjectivity in the scoring and 
interpretation of objective tests. This point was discussed with several German 
psychologists who stressed the fact that they tend to concentrate more on the re- 
actions of the individual to the test situation and on qualitative performance and 
less on the objective test data than do their American counterparts. 

In appraising personality, psychologists there lean fairly heavily on instruments 
of a projective nature, mostly of European origin. Among these, the Sceno Test, 
which is similar to our World Test, figures prominently. Interestingly, the Rors- 
chach technique though widely used does not appear to be quite as generally em- 
ployed as it is here. It was not clear just what theoretical basis or frame-of-reference 
underlay the use of the projective devices there, but they seem to be used to quite an 
extent in an empirical way. In some settings such American tests as the Thematic 
Apperception Test and the Symonds’ Picture Story, as well as recent theoretical 
texts may be found despite the cost of these materials due to the limited purchasing 
power of the Deutschmark. However, little interest was evident in clinical settings 
visited in the questionnaire approach to the evaluation of personality. 

Establishing the intelligence level and evaluating intellectual functioning is 
another well-defined area of operation. To aid in this, standardized tests of German 
origin are used as well as translations of American instruments. A version of the 
original Stanford has been available for some time, and there is considerable interest 
in a project currently under way at Hamburg to standardize German versions of the 
Wechsler-Bellevue and the Wechsler Intelligence Scale for Children. 

Among clinicians in Germany, graphology is an interest that is often in evi- 
dence. This probably can be traced to the fact that a couple of training centers 
focused especially on graphology. While German practitioners are generally more 
sympathetically disposed toward graphology than psychologists here, occasional 
discussions indicated that they regard the analysis of handwriting as a revealing 
though still-to-be-proven tool among methods for evaluating personality. 

Psychologists also are doing some therapy in Germany. Intensive counseling or 
therapeutic work appears to be done primarily in the newly-developed child guidance 
clinics and also in private practice. In one or two of the training schools for delin- 
quents the psychologists are engaged in counseling that is limited by the pressure of 
work loads. To the rather limited extent that psychotherapy is done in Germany, 
the question as to which discipline—psychiatry, psychology or psychiatric social 
work—should do it has not arisen to the extent that it has here. This is partly an 
outgrowth of the fact that German psychologists working in the clinical field see 
their role largely as that of diagnosticians. Moreover, because of the small number of 
psychologists in applied clinical work, they are neither a professional nor a financial 
threat to their psychiatric colleagues. Incidentally, the psychotherapeutic principles 
and methods employed were not too clear-cut: clinicians seemed to favor eclectic 
approaches ranging from extremes of the directive to the non-directive. 

Psychologists employed in vocational settings are engaged in some counseling 
of a relatively brief nature directed at helping clients to establish vocational 
objectives besides testing intelligence and aptitudes. Again, the heavy load of cases to 
be served undoubtedly affects the batteries of tests employed and the pace and 
nature of the counseling process. 


FINAL OBSERVATIONS AND CONCLUSIONS 


German education above the eighth grade is highly selective, and a much smaller 
proportion of the population than in the United States attends full-time school beyond that level. College 
training is rigorous. Instruction is characteristically thorough academically, but 
there is a lack of attention to the rounding out of training of the young psychologist 
by affording practical experiences in settings related to his field of specialization. 
Evidently the universities still operate largely on the assumption that theirs is the 
realm of theoretical psychology and that the preparation of students to serve in 
applied settings like child guidance centers, hospitals and clinics is less within their 
sphere. As a result, there are few internship opportunities affording supervised 
training coupled with actual experience. The Psychological Institute of the Uni- 
versity of Hamburg stands out among centers of its type in Germany in offering that 
sort of training. Though only several other corresponding units elsewhere in the 
country were visited, the Institute at Hamburg directed by Professor Bondy seemed 
to resemble American graduate training centers in clinical psychology most closely in 
method and spirit. 

Some general conclusions present themselves in reviewing the observations and 
experiences of three months in Germany. Despite the writer’s own restricted ability 
to make contact with German colleagues because of a language barrier, he was im- 
pressed by the caliber of the people working in psychology at the professional and 
the graduate student level. They are intelligent and intensely interested in their 
work. They were clearly quite devoted to their field and by no means preoccupied 
with financial return which affords them less in actual standard of living though 
their income in Deutschmarks corresponds roughly to the dollar income of psychologists 
in comparable positions in America. Those German psychologists who have 
been to the United States or England through some exchange program seemed generally 
quite sensitive to professional trends current in the United States and in the 
United Kingdom. While these ideas have not been invariably adopted, quite a few 
of the exchangees have been active disciples of the approaches they had had op- 
portunity to observe. Ironically, some exchangees were serving to reintroduce con- 
cepts of modern dynamic psychology into a country in which such thinking was 
effectively eradicated under Hitler. 

It was especially encouraging to see the concept of the team approach imple- 
mented with a reasonable degree of success in at least a few of the child guidance 
clinics. Not only is the team approach important as a method of operation, but it 
assumes special importance for Germany as it marks a departure from the more 
stratified, authoritarian pattern that has been so characteristic of the country as a 
whole. The pattern of collaborative effort of a group of specialists each contributing 
in his particular way to the understanding and re-educative work with the individual 
is one which will probably take considerable time to establish broadly there. 

It was also heartening to find a sensitivity for a need for soundly standardized 
tests in Germany. While American tests have been “adapted” for use, there was 
genuine recognition of the fact that something better than makeshift applications 
to the German population is needed. Even more significant than this attitude toward 
the techniques was a reawakened interest in the individual coupled with a 
greater concern for his needs and his problems. This interest is by no means con- 
fined to psychologists, but is shared by forward-looking administrators in the welfare, 
social services, youth activities and correctional fields. 

One especially rich and enlightening experience stemmed from participating in 
a training seminar for workers in the child guidance field conducted by a British 
psychiatrist and psychiatric social worker. The discussion of basic concepts that 
took place served to underscore the differences that remain in the thinking of trained 
German professionals and a “pick-up” Anglo-American team of clinicians. However, 
we felt that we derived interesting insights in the interchange of ideas with our 
German counterparts. It is to be hoped that further exchange of our colleagues with 
workers in Germany—to allow for even more intensive collaboration and cross- 
fertilization of ideas—will be possible in the future. 





A VALIDATION STUDY OF THE TAYLOR MANIFEST ANXIETY SCALE¹ 
DONALD P. HOYT AND THOMAS M. MAGOON 
Student Counseling Bureau, University of Minnesota 


PROBLEM 


Taylor has recently reported on a scale to measure “manifest anxiety”. The 
general acceptance of anxiety as a central concept in psychotherapy argues for the 
importance of being able to validly and reliably assess it. This report will describe 
an attempt to validate the Taylor Scale in a college counseling population. 


PROCEDURE 
Eight experienced counselors were provided with lists of those clients with 
whom they had had two or more counseling appointments in the previous six-month 
period.² The counselors were asked to select from this list only those clients they 
felt they knew well enough to rate as to the “degree of manifest anxiety” present. 
Manifest anxiety was defined as “those behaviors or characteristics of a client that 
lead you to classify him as: (a) Nervous (i.e., mannerisms such as nail biting, 
knuckle-cracking, chain smoking; profuse perspiration; etc.); (b) Tense (i.e., unable 
to relax, continually working under pressure, hand trembling, tics, etc.); (c) Easily 
embarrassed (i.e., readily blushes, stammers, etc.); (d) Worried (i.e., apprehensive 
over what will happen from day to day; doubts self continually; etc.).” These clients 
were assigned to one of three categories by their counselors: “High”, “Medium”, 
and “Low” anxiety. 


¹This research was done under a grant from the Hill Family Foundation. 

²The authors wish to express their gratitude for the cooperation of the entire Student Counseling 
Bureau staff who made the judgments for this study. 

All students in each group who had recently taken an MMPI constituted the 
comparison groups. Since the items from the Taylor scale were obtained from the 
MMPI, it was a simple matter to score these answer sheets on the Taylor items. 
Scores made by “High”, “Medium”, and “Low” anxiety groups were compared 
separately for each counselor. A combined group was also used. In all, a total of 289 
cases were studied, including 88 rated as having high anxiety, 115 as medium, and 86 
as low. 


RESULTS 


TABLE 1. MEANS AND SIGMAS OF TAYLOR MANIFEST ANXIETY SCORES FOR CLIENTS JUDGED AS 
“HIGH”, “MEDIUM”, AND “LOW” BY DIFFERENT COUNSELORS 


Table 1 summarizes these data. For all counselors, the group judged as having 
“High” anxiety averaged higher than did the groups judged as “Medium” or “Low”. 
In addition, for six of the eight counselors, the medium group had an average Taylor 
score which fell between the two extremes. 

Table 1 also shows that the counselors differed considerably in the proportion of 
cases they assigned to each category. Thus, Counselor A rates only 5 of his 38 clients 
as having high anxiety, while Counselor G rates 28 of his 51 cases in this category. 
Are these differences due to different frames of reference among the counselors, or 
to the fact that some counselors see more highly anxious clients than others? 

A tentative answer to this question is obtained by an analysis of variance be- 
tween the 8 groups of clients for each of the three categories. If the counselors use 
different frames of reference in judging their clients, then we might expect one 
counselor’s “Highs” to score differently than another’s. If, on the other hand, the 
frames of reference are all roughly the same, and the counselors simply see different 
numbers of “Highs” and “Lows”, then there should be little difference between 
counselors in the average score for a given group. 

The results lend support to the latter interpretation. In each of the three analyses 
(“High”, “Medium”, and “Low”), the scores made by the clients of different 
counselors were found to be homogeneous with respect to both the means and the 
variances. In fact, the F value was over 1.00 only in the case of the Low group, 
where F = 1.35. With 7 and 78 degrees of freedom, this value is far smaller than the 
2.12 needed for statistical significance. This statistical result is substantiated in 
part by the fact that Counselor G, who had such a disproportionate number of High 
anxiety cases, is the only clinical psychologist included in the study. He often accepts 
referrals of highly disturbed clients from those counselors who have had less 
clinical experience. 

One more comparison was made using the data of Table 1. This was a comparison 
of the mean scores of each of the three groups. In comparing the “Highs” with 
the other groups, the Behrens-Fisher “d” test was used, rather than the usual “t” 
test, since the “High” group was found to be significantly more variable. The differences 
between the “High” and “Medium” and the “High” and “Low” groups are 
both highly significant (d = 5.10 and 6.79, p < .001). The difference between the 
“Medium” and “Low” groups, using the “t” test, is not significant. 

How effective was the Taylor Scale in separating individuals into “High”, 
“Medium”, and “Low” groups as judged by counselors? Table 2 provides the 
answer. Here the Taylor scores have been trichotomized into scores at or above 
the mean of the “Highs” (21+), at or below the mean of the “Lows” (11-), and a 
third group between these extremes. 


TABLE 2. RELATIONSHIPS OF TAYLOR SCORES TO COUNSELORS’ RATING OF DEGREE OF MANIFEST 
ANXIETY 

                                    Taylor Scores 
Counselors’ Ratings     21 or more     20-12     11 or less       N 
High                        47           24          17           88 
Medium                      19           43          53          115 
Low                         12           25          49           86 
N                           78           92         119          289 


From Table 2 it is seen that 47 of the 88 clients rated as “Highly anxious” had 
Taylor scores of 21 or more. Only 17 had scores of 11 or less. By contrast, of the 86 
clients in the “low anxious” group, only 12 had scores as high as 21 and 49 had 
scores as low as 11. Accepting the counselor’s judgment of anxiety as accurate for 
the moment, we see that the Taylor scale “misses” badly on 19% of the “Highs” 
and 14% of the “Lows”. 

The chi-square value for Table 2 is 50.64, which for 4 degrees of freedom is 
significant far beyond the .001 level. A general idea of the degree of relationship can 
be gained by referring to the contingency coefficient. This value is .39, and is raised 
to .47 when adjusted for the number of categories.“? This value can be taken as a 
rough estimate of what the correlation between the two sets of data would be if 
both had been treated as continuous variables. While this interpretation lacks 
statistical rigor, it is probably accurate enough to permit the generalization that the 
Taylor scale possesses reasonably high validity for our population within the limits 
of our criterion measure. 
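
The figures in the two preceding paragraphs can be rechecked from the Table 2 frequencies. The following is a minimal sketch, not the authors' own computation: it assumes the ordinary Pearson chi-square, and it assumes that "adjusted for the number of categories" means dividing the contingency coefficient by its maximum for a 3 x 3 table, sqrt(2/3). A recomputation of this kind gives a chi-square of roughly 50.8, close to the 50.64 reported, and reproduces the coefficients of .39 and .47.

```python
import math

# Frequencies from Table 2: rows are counselors' ratings (High, Medium, Low);
# columns are Taylor scores (21 or more, 20-12, 11 or less).
observed = [
    [47, 24, 17],
    [19, 43, 53],
    [12, 25, 49],
]


def chi_square(table):
    """Return the Pearson chi-square and total N for an r x c frequency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (obs - expected) ** 2 / expected
    return chi2, n


chi2, n = chi_square(observed)

# Contingency coefficient C = sqrt(chi2 / (chi2 + N)).
c = math.sqrt(chi2 / (chi2 + n))

# ASSUMPTION: the "adjustment for the number of categories" is taken to be
# division by the maximum attainable C for a k x k table, sqrt((k - 1) / k).
k = len(observed)
c_adjusted = c / math.sqrt((k - 1) / k)

# "Misses": Highs scoring 11 or less, and Lows scoring 21 or more.
miss_high = observed[0][2] / sum(observed[0])
miss_low = observed[2][0] / sum(observed[2])

print(f"chi-square = {chi2:.2f} on {(k - 1) ** 2} df, N = {n}")
print(f"C = {c:.2f}, adjusted C = {c_adjusted:.2f}")
print(f"misses: {miss_high:.0%} of Highs, {miss_low:.0%} of Lows")
```

The small gap between the recomputed chi-square and the published value does not change either coefficient at two decimal places.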

A final question which this study attempts to answer is, “Which of the Taylor 
scale items are functioning effectively for this group?” Table 3 provides the statistical 
answer. This table presents the percentage of students judged as having “High” 
or “Low” anxiety who answered “True” to each item on the scale. It will be noted 
that there are two samples included in Table 3. Each sample is simply a random 
half of the total population of the “Highs” and “Lows”. They were divided this way 
so that the analysis on Sample 2 could be considered as a replication of that on Sample 
1, thus controlling statistical errors to a greater extent. 

Table 3 shows that 17 of the 50 items did not reliably differentiate the “Highs” 
from the “Lows” on either sample. Sixteen items were significant for both samples, 
and the remaining 17 were significant on one sample but not the other. For 14 of 
these 17 items, the direction of the percentage difference was identical 
on the two samples, indicating the likelihood of validity for these items. Adding 
these 14 items to the 16 which were significant for both samples gives a total of 30 of the 
50 items which probably have validity for this population. 


TABLE 3. PER CENT OF “HIGH” AND “LOW” ANXIETY SUBJECTS WHO ANSWER “TRUE” TO TAYLOR 
SCALE ITEMS 

                          Sample 1            Sample 2 
Items                   High     Low        High     Low 
                        N=44     N=43 





Items significant at the 5% level or better in both samples 
I believe I am no more nervous than most others. 

I work under a great deal of tension. 
I cannot keep my mind on one thing. 

I am more sensitive than most other people. 
I frequently find myself worrying about something. 

I am usually calm and not easily upset. 

I feel anxiety about something or someone almost all the 
time. 
I am happy most of the time. 
I have periods of such great restlessness that I cannot 
sit long in a chair. 
I have sometimes felt that difficulties were piling up so 
high that I could not overcome them. 52. 
I find it hard to keep my mind on a task or job. 56.8 
I am not unusually self-conscious. 40.9 
I am inclined to take things hard. 65.9 
Life is a strain for me much of the time. 38.6 
At times I think I am no good at all. 60.5 
I am certainly lacking in self-confidence. 63.6 


Items significant at the 5% level or better in one, but not both samples 
I do not tire quickly. 59.1 
I have very few headaches. 81.8 
I frequently notice my hand shakes when I try to do 
something. 31 
I worry quite a bit over possible misfortunes. 

I am very seldom troubled by constipation. 

I have a great deal of stomach trouble. 

I have had periods in which I lost sleep over worry. 
My sleep is fitful and disturbed. 

I wish I could be as happy as others seem to be. 

I cry easily. 

It makes me nervous to have to wait. 

I have been afraid of things or people that I know could 
not hurt me. 

I certainly feel useless at times. 

I am a high-strung person. 

I sometimes feel that I am about to go to pieces. 

I shrink from facing a crisis or difficulty. 

I am entirely self-confident. 


Items not significant at the 5% level in either sample 

I am troubled by attacks of nausea. 

I worry over money and business. 

I blush no more often than others. 

I have diarrhea once a month or more. 

I practically never blush. 

I am often afraid that I am going to blush. 

I have nightmares every few nights. 

My hands and feet are usually warm enough. 

I sweat very easily even on cool days. 

Sometimes when embarrassed, I break out in a sweat 
which annoys me greatly. 

I hardly ever notice my heart pounding and I am seldom 
short of breath. 

I feel hungry almost all the time. 

I dream frequently about things that are best kept to 
myself. 

I am easily embarrassed. 

I sometimes become so excited that I find it hard to get 
to sleep. 

I must admit that I have at times been worried beyond 
reason over something that really did not matter. 

I have very few fears compared to my friends. 


It is possible, of course, that the 20 relatively non-valid items have validity for 
other populations. They may be useful, for example, in discriminating at the extremes 
of the continuum of manifest anxiety. But for this population, they seem to add little 
to the validity of the test. 
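
The three-way sorting of items used in Table 3 (significant in both samples, in one, or in neither) can be sketched as below. This is an illustration only. The report does not say which significance test was applied to the percentage differences, so a pooled two-proportion z test at the two-sided 5% level is assumed here, and the item counts in the example are hypothetical rather than taken from Table 3. Only the group sizes (44 "Highs" and 43 "Lows" per sample) come from the text.

```python
import math


def two_proportion_z(true_high, n_high, true_low, n_low):
    """Pooled z test for the difference between two independent proportions.

    Returns (z, two-sided p). ASSUMPTION: this test stands in for whatever
    procedure the original authors actually used.
    """
    p_high = true_high / n_high
    p_low = true_low / n_low
    pooled = (true_high + true_low) / (n_high + n_low)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_high + 1 / n_low))
    z = (p_high - p_low) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail area
    return z, p_value


def classify_item(sample1, sample2, alpha=0.05):
    """Sort an item by its behavior in the two random half-samples.

    Each sample is a tuple (true_high, n_high, true_low, n_low).
    """
    results = [two_proportion_z(*sample1), two_proportion_z(*sample2)]
    significant = [p < alpha for _, p in results]
    same_direction = (results[0][0] > 0) == (results[1][0] > 0)
    if all(significant):
        return "significant in both samples"
    if any(significant):
        suffix = "same direction" if same_direction else "direction reversed"
        return "significant in one sample, " + suffix
    return "not significant in either sample"


# HYPOTHETICAL item: 25 of 44 Highs and 10 of 43 Lows answer "True" in
# Sample 1; 20 of 44 Highs and 14 of 43 Lows do so in Sample 2.
item_sample1 = (25, 44, 10, 43)
item_sample2 = (20, 44, 14, 43)

label = classify_item(item_sample1, item_sample2)
print(label)
```

Treating Sample 2 as a replication in this way guards against accepting an item whose difference arose by chance in a single half of the data.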


SUMMARY AND CONCLUSIONS 


Experienced counselors made judgments as to the degree of manifest anxiety 
present in clients they had recently seen. Taylor Manifest Anxiety scores for these 
clients were compared with the counselor’s ratings. In addition, an item analysis of 
the Taylor scale was done to determine the validity of the various items for this 
population. The results were as follows: 

1. There were highly reliable differences between scores made by clients judged 
to be “High” and those judged to be “Medium” or “Low” on manifest anxiety. 

2. There were no significant differences between the counselors when the variances and 
means of their “High” groups were compared. The same was true of their “Medium” 
and “Low” groups. 

3. When Taylor scores are trichotomized and compared with counselor ratings, 
the adjusted contingency coefficient is .47. 

4. Thirty of the fifty individual items on the Taylor Scale appear to have validity 
for this population. 



A QUALITATIVE ANALYSIS OF THE VOCABULARY RESPONSES OF 
INSTITUTIONALIZED, MENTALLY RETARDED CHILDREN¹ 


NED PAPANIA 
Wayne County Training School, Northville, Michigan 


PROBLEM 


The purpose of this study was to determine whether institutionalized, mentally 
retarded children follow developmental trends similar to those evidenced by children 
of average intelligence (“normals”) in their ability to define words abstractly and 
to determine whether this ability was directly predictable from the MA. 

Empirical observations of the high grade mentally retarded children at the 
Wayne County Training School seem to indicate that the mental age is not an ac- 
curate predictor of academic achievement. Generally speaking, the children often 
reach their educational plateaus in reading and arithmetic at some point below that 
which might be expected solely on the basis of the MA. Realizing that this phen- 
omenon is undoubtedly quite complex, it nevertheless seems plausible that it is in 
part due to a more concrete manner of thinking. If the abstract definition of words 
is a reflection of this factor as Piaget “: 5) and Werner“ seem to indicate, it would 
follow that the mentally handicapped group might be less abstract in defining words 
than would a group of “normal” children of the same mental age. 


¹The author wishes to thank Dr. Thorleif G. Hegge, Director of Research and Education, Dr. 
Sidney Rosenblum and Mr. J. Edwin Keller for their assistance with this paper. The author also wishes 
to express his appreciation to Robert H. Haskell, M. D., Medical Superintendent of the Wayne County 
Training School. 


The general hypothesis of this study is derived from the above inference, viz.: The ability to 
define words abstractly will develop more slowly in a group of endogenous, mentally 
retarded children than it will in a group of children of average intelligence at the 
same MA level. The specific hypotheses are: 


1. There will be an increase in the proportion of synonym and explanation 
(abstract) definitions within the mentally retarded groups as MA increases. 


2. There will be a decrease in the use, description, demonstration and 
repetition (concrete) types of definition as MA increases. 


3. At a given MA level, the “normal” group will give a significantly greater 
number of abstract definitions than will the mentally retarded group. 


4. At a given MA level, the “normal” group will give significantly fewer 
concrete responses than the mentally retarded group. 


PROCEDURE 


The Binet vocabulary responses of fifty white, “non-brain-injured” children 
(IQ 65-75) at each MA level from six through ten were analyzed by Green’s method 
as modified by Feifel and Lorge.² The vocabulary test used was part of the routinely 
administered Stanford-Binet. All definitions had been recorded verbatim. 

The characteristics of the mentally retarded group are presented in Table 1. 
The data obtained by Feifel and Lorge on “normal” children were used in making 
comparisons. These children (100 children at each C.A. level from 6 through 14) 
obtained mean Binet IQ’s ranging from 100 to 104. It thus became possible to assume 
that CA was equal to MA and to compare these results with those obtained from 
mentally handicapped children at the several MA levels. 


TABLE 1. CHARACTERISTICS OF THE MENTALLY HANDICAPPED GROUP 

                  M. A. (In Years)       C. A. (In Years) 
Groups            Mean      S. D.        Mean      S. D. 
M. A. 6            6.45      .23          9.24      .57 
M. A. 7            7.42      .62         10.63      .66 
M. A. 8            8.38      .70         12.05      .87 
M. A. 9            9.41      .79         13.61      .85 
M. A. 10          10.28      .86         15.87      .70 

*These groups were matched as closely as possible for IQ. 


A detailed account of the criteria for the various categories of the vocabulary 
analysis may be found in Feifel’s monograph, or in the study by Feifel and Lorge. 
The categories in terms of which this analysis was made are described 
in Table 2. 

Reliability of the scoring was obtained by randomly picking a sample of 50 
records to be independently scored by another psychologist. The percent agreement,³ 
as determined by Arrington’s formula, was .97. 


²Twenty-five boys and twenty-five girls were selected at each MA level. As there were no statistically 
significant qualitative differences between the vocabulary responses of the two sexes, the records 
were combined. 

³Percent agreement = (2 × agreements) / (2 × agreements + disagreements). 


TABLE 2. QUALITATIVE CATEGORIES USED IN THE ANALYSIS OF BINET VOCABULARY WORDS 

Category                 Example 
1. Synonym               Orange = a fruit 
2. Explanation           Priceless = It’s worth a lot of money 
3.* Use                  Orange = You eat it 
    Description          Orange = It’s round 
4.* Demonstration        Eyelash = S. points to lash 
    Repetition           Puddle = Puddle of water 
5. Error                 Wrong definition 
                         Clang association 
                         Omission 
                         Incorrect demonstration 

*Use and Description have been combined in the final scoring as have Demonstration and 
Repetition in keeping with Feifel and Lorge’s method. 
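
The percent-agreement index given in the footnote on scoring reliability can be written out as a short function. This is a sketch under stated assumptions: the formula is read as twice the agreements divided by twice the agreements plus the disagreements, and the counts in the example are hypothetical, chosen only because they yield the reported value of .97. The study itself does not give the underlying tallies.

```python
def arrington_agreement(agreements, disagreements):
    """Percent agreement with agreements counted double.

    Read from the footnote as 2A / (2A + D), where A is the number of
    responses both scorers classified alike and D the number on which
    they differed.
    """
    doubled = 2 * agreements
    return doubled / (doubled + disagreements)


# HYPOTHETICAL tallies for two independent scorers of the 50 sampled records.
agreements = 485
disagreements = 30

index = arrington_agreement(agreements, disagreements)
simple = agreements / (agreements + disagreements)

print(f"doubled-agreement index: {index:.2f}")
print(f"simple proportion agreed: {simple:.2f}")
```

Because agreements are counted twice, the index runs somewhat higher than the simple proportion of responses on which the scorers agreed, which is worth bearing in mind when comparing it with other reliability figures.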


The qualitatively analyzed responses for each MA group were converted into 
proportions of correct responses. The proportion of right answers was also calculated. 
Since omissions are counted as wrong answers in this scoring system this latter pro- 
portion was obtained by dividing the total number of words in the vocabulary test 
(forty-five) into the number of correct answers. It was also necessary to convert 
Feifel and Lorge’s data into proportions. The significance of differences between 
these proportions was computed by means of the one-tailed t test. 


RESULTS 


The increase in the more abstract definitions and the decrease in the more concrete 
definitions with increasing MA may be noted in Figures 1 and 2. The statistical 
significance for this rate of change for the “normal” groups may be found in the 
article by Feifel and Lorge. Table 3 indicates the significance of change for the 
retarded groups. The use of synonyms increases at a statistically significant rate 
for each two years of mental age. The 6-7 year difference is also significant. The 
use of explanatory definitions follows this same pattern except for the 6-8 year comparison, 
which is not significant. 


FIG. 1. PROPORTION OF “ABSTRACT” (SYNONYM AND EXPLANATION) RESPONSES 
[Proportion of “abstract” responses plotted against mental age for the retarded and “normal” groups.] 

FIG. 2. PROPORTION OF “CONCRETE” (USE AND DESCRIPTION; REPETITION AND DEMONSTRATION) RESPONSES 
[Proportion of “concrete” responses plotted against mental age for the retarded and “normal” groups.] 


TABLE 3. SIGNIFICANCE OF CATEGORY PROPORTION DIFFERENCES BETWEEN MA GROUPS 

†The normal groups are of the same MA as the group to which they are compared. 
*Significant at .05 level (1 tail test) 
**Significant at .01 level or beyond (1 tail test) 


In concrete definitions the use and description categories significantly decrease 
for each two years of MA growth except at MA 8. The repetition and demonstration 
categories show no statistically significant differences and in fact show a reversal 
at MA 8, though in general the trend is in the predicted direction. The first 
two hypotheses thus seem to be confirmed. 

There were no significant differences in the proportion of correct responses be- 
tween the retarded and the “normal” groups. In fact the retarded group gives slight- 
ly fewer wrong definitions than does the “normal” group up to an MA of 9. All of 
the qualitative comparisons between the retarded and “normal” groups reach an 
acceptable level of statistical significance. (See Table 3). Thus at all MA levels, 
the “normals” respond with proportionally more synonym and explanation type of 
responses than do the retarded. At all MA levels except MA 6, they give fewer use, 
description, repetition or demonstration responses than do the retarded. This con- 
firms hypotheses 3 and 4, and leads to the conclusion that though the number of 
correctly defined words is closely related to the MA, the qualitative level is not. 


DISCUSSION 


The results of the present study seem to indicate a real difference in abstract 
verbal behavior between children of average intelligence and those of lower intelli- 
gence, at least insofar as this is measured by the verbal definitions of an institutional- 
ized population. This relative concreteness may be a factor in the academic under- 
production on the part of these retarded boys and girls. 

At present we can only speculate on the factors which may be related to this 
phenomenon. Certainly social class memberships, emotional problems or perhaps 
mental retardation per se are possibilities. Only further research can clarify these 
relationships. 


SUMMARY 


The Binet vocabulary responses of five groups of 50 institutionalized “endogenous” 
mentally retarded children at all mental ages from 6 through 10 were 
analyzed by Green’s method. The differences between mental age levels were 
examined. These groups were also compared to groups of children with average 
IQ’s previously reported on by Feifel and Lorge. 

The one-tailed t test was employed to ascertain statistical significance. Within 
the mentally retarded group it was found that (a) “Abstract” definitions generally 
increased with an increase in MA; (b) “Concrete” definitions decreased with an increase 
in MA. 

When compared to “normal” groups of like MA, the retarded groups differed 
significantly in the following ways: They gave (a) fewer “abstract” definitions, and 
(b) more “concrete” definitions to the Binet vocabulary words. There were no 
differences between these groups in the proportion of correct definitions. Thus while 
the number of acceptable definitions seems to be directly related to MA, the qualitative 
level of the response does not seem to be. 



A TRANSPOSED ANALYSIS OF THE BENDER GESTALTS 
OF BRAIN DISEASE CASES¹ 
WILSON H. GUERTIN 

Veterans Administration Hospital 
Knoxville, Iowa 


PROBLEM 


Clinical psychologists and psychiatrists have been rather prone to ascribe certain 
signs and symptoms to those groups of patients known as “organics.” A disregard 
of the varied etiologies and the foci of pathology could only be justified by 
the view of the brain as a simple, all-or-none, functioning mechanism. Wittenborn, 
upon studying psychiatric symptoms, found that there were six different 
factor-types of “organic” patients. These types of organic reactions seemed unrelated 
to the known etiologies. The present study also employs factor analysis but 
investigates a transposed matrix of the performance of a group of “organic” persons 
on the Bender Gestalt. 

METHOD 


The study employed 27 resident male patients with a variety of diagnoses of 
organic brain disease. No particular attempt was made to control any of the in- 
cidental case-history variables, and the duration of illness was a year or more. Paret- 
ics were avoided because of the rather unique and vivid psychiatric symptoms often 
displayed, and only one atypical case was included. Patients with rather clear evi- 
dence of organic brain disease were selected to provide a sound basis for designating 
the sample as “organic.” While there were only 27 patients used, one of the sub- 
jects had a Bender retest available which showed considerable general change in 
nine months’ time. By including this retest the N of the study becomes 28. 

The test was administered in a standard fashion“. Each record was scored 
with respect to 100 different items. Distortions on the various figures were evaluated 
by a method which follows conventional scoring procedures, constitutes a rather 
exhaustive analysis of each record, and has been developed by combining items re- 
ported elsewhere ®: * 4), 

Tetrachoric intercorrelations between the individuals were obtained from four- 
fold contingency tables and charts. The resulting intercorrelation matrix was factor- 
ed by the multiple-group centroid method “’. Communalities for those individuals 
within a cluster were first estimated by employing the individual’s largest correla- 
tion within the cluster. Those outside the clusters had communality estimates based 
upon the highest intercorrelation in the whole column. Re-estimates of the com- 
munalities were made after the first factoring, and plots of the factor space were 
studied in order to provide more stable clusters. 
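
The transposed correlations described above can be illustrated with a short sketch in Python. It is offered only as an illustration under stated assumptions: the cosine-pi formula is a standard closed-form approximation to the tetrachoric coefficient and stands in here for the charts used in the study, and the function names and the toy figures are hypothetical.

```python
import math

def fourfold(person_x, person_y):
    """Cell counts of the 2x2 table for two persons scored on the same items.

    Each argument is a sequence of 0/1 item scores (1 = item scored present).
    """
    pairs = list(zip(person_x, person_y))
    a = sum(1 for x, y in pairs if x and y)          # present in both records
    b = sum(1 for x, y in pairs if x and not y)      # present in x only
    c = sum(1 for x, y in pairs if not x and y)      # present in y only
    d = sum(1 for x, y in pairs if not x and not y)  # absent in both records
    return a, b, c, d

def tetrachoric_cos_pi(a, b, c, d):
    """Cosine-pi approximation to the tetrachoric r (an assumed stand-in for charts)."""
    if b * c == 0:
        if a * d == 0:
            raise ValueError("table is degenerate; coefficient is undefined")
        return 1.0  # no discordant items at all
    return math.cos(math.pi / (1.0 + math.sqrt((a * d) / (b * c))))

# Hypothetical table for two patients over 100 items.
r = tetrachoric_cos_pi(40, 10, 10, 40)   # about 0.81
```

Computing this coefficient for every pair of the 28 records would give a 28 x 28 matrix of correlations between persons rather than between items, which is the sense in which the analysis is transposed.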

Psychiatric characteristics were assessed by the Malamud-Sands Rating Scale “ 
at the time of Bender administration. The rating scales provided the behavioral 
symptom information about each of the patients necessary for recognizing those 
personality characteristics underlying each of the types of Bender performance. 


RESULTS 


The matrix was factor analyzed roughly for the first time so that the basic 
factor structure would be revealed. The refactoring by the multiple-group centroid 
method did not modify these original factors appreciably, and the final oblique 
matrix is presented in Table 1. 

A fair amount of communality was extracted from the residual correlation 
matrix by a complete centroid. However, this factor was so heavily correlated with 
the C-cluster of the multiple-group factoring that a clinically meaningful rotation 
could not be made. In the hope of concentrating on the more clinically significant 
features, further factoring was not attempted. 


¹From the Veterans Administration Hospital, Knoxville, Iowa. The author is indebted to Victor 
Zilaitis for his collaboration in the development of the initial design of this study. 


TABLE 1. OBLIQUE FACTOR MATRIX OF TYPES OF ORGANIC BENDER PERFORMANCE* 

Type-Factor A: With Curvilinear Distortions 
Type-Factor B: With Spatial Disability and Loss of Control 
Type-Factor C: With Constriction and Feelings of Inadequacy 

Etiological Diagnosis 
Trauma 23 
Trauma 16 
Undetermined 49 
Trauma 
Trauma 
Trauma 
Infectious 
Trauma 
Alcoholic Korsakow 
Infectious 
Infectious 
Senile 
Huntington’s Chorea 
Cerebral Arteriosclerosis 
Anoxia 
Trauma 
Infectious 
Epilepsy 
Epilepsy 
Trauma 
Infectious 
Infectious 
Trauma 
Paresis 
Trauma 
Cerebral Arteriosclerosis 
Infectious 
Cerebral Arteriosclerosis 

*Decimal points omitted. 


The original communality estimates accounted for almost half the total var- 
iance of the individuals, and the final factoring accounted for 73 percent of the orig- 
inally estimated communality. The intercorrelations of the oblique factors were as 
follows: .44 for A with B, .39 for A with C, and .31 for B with C. 


DISCUSSION 


The moderately high communality found among the subjects of the study sug- 
gests that the Bender-Gestalt is successful in pointing out common features in the 
performance of these organic patients. These common features need not necessarily 
be based upon organic impairments and might be found among non-organic patients 
or even normals. However, study of each of the derived factors suggests that the B 
factor is clearly related to organic deficit. The other two factors can be indirectly 
related to organic brain damage but could conceivably be found in other populations. 

No general factor of organicity seems to be disclosed by the data even though 
factor A is loaded .20 or better on all but five individuals. While this constitutes a 
rather large group-factor, its Bender and psychiatric characteristics are not those 
commonly regarded as differential for organics“. The Bender-Gestalt objective 
scoring system for the derived factors will not adequately describe any organic brain- 
diseased patient although some partial classification system might be suggested. The 
factor-types are as follows: 


A—Organics with Curvilinear Distortions. This type of patient shows a pre- 
dominance of distortions of the curvilinear type. Such distortions with their psycho- 
logical implications are discussed more fully in a previous publication“). On both 
figures 1 and 2 there is an inability to maintain a horizontal slope; on figure 3 a sort 
of torsion seems to exist to produce non-parallel lines, and on figure 6 there is contour 
asymmetry. There is no decrease of the major curvature on figure 4, and heavy 
pencil pressure is seen. In contrast to the next type there is good juxtaposition of 
the elements of figure 7. Psychiatrically, such a patient shows more tension than 
those in the next two groups. It is hypothesized that this type of patient is rather 
unstable and impulsive, and that considerable tension underlies the apparently 
careless productions. Quite a few moderate and mild distortions appear. The Bend- 
ers do not look distinctively organic. Those most characteristic of the group were 
five post-traumatics and one undetermined type. This one factor is rather closely 
related to the etiology of the brain disease. In other words, post-traumatic cases 
are more likely to show this type of performance than either of the other two factor- 
types (Mann-Whitney U-test, 2.5 per cent level of confidence). Except 
for this relationship between traumata and the A type, no correspondence between 
etiology and type-factors was encountered. 
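
The Mann-Whitney comparison mentioned above can be sketched as follows. The sketch assumes, as the text seems to imply, that the A-factor loadings of post-traumatic cases were ranked against those of the remaining cases; the loadings shown are hypothetical and are not taken from Table 1, and the function name is illustrative.

```python
def mann_whitney_u(sample_x, sample_y):
    """U for sample_x: number of (x, y) pairs with x > y, ties counted as one half."""
    u = 0.0
    for x in sample_x:
        for y in sample_y:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u

# Hypothetical A-factor loadings (NOT the study's data).
trauma_loadings = [0.61, 0.55, 0.48, 0.40, 0.22]
other_loadings = [0.35, 0.20, 0.15, 0.10, 0.05, -0.02]

u_trauma = mann_whitney_u(trauma_loadings, other_loadings)
u_other = mann_whitney_u(other_loadings, trauma_loadings)
# The two U values always sum to the product of the two sample sizes.
```

The smaller of the two U values is the one referred to published tables; a very small value indicates that one group's loadings lie almost entirely above the other's.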


B—Organics with Spatial Disability and Loss of Control. This type of patient 
does his best to conform to the requirements of the testing situation, but nonemo- 
tional organic factors impair his performance. Irregularity and a shifting in the verti- 
cal slope of figure 2, as well as perseveration and number distortion in figure 2, point 
to organic memory deficit and inability to maintain a set. This type of person is 
inclined to an expansive use of space, particularly extending the drawings to the 
right-hand margin, and experiences no restriction on the size of figure 6 as does the 
individual in the next cluster. The number of distortions on the Bender is high. 
Psychiatrically, this type shows the most deviant behavior, and appearance is un- 
tidy. Deficiencies are unrecognized, and consequently there is more out-goingness 
and expressiveness than for either of the other two types. Judgment and planning 
are impaired. The four most typical individuals have an etiology distributed among 
the following four organic types: post-trauma, post-anoxia, cerebral arteriosclerosis, 
and Huntington’s Chorea. 


C—Organics with Constriction and Feelings of Inadequacy. This type of individ- 
ual is careful and constricted. He avoids over-extending himself, but minor dis- 
tortion items are encountered. The productions seldom appear to be characteristic- 
ally organic. Erasure is present, demonstrating self-criticalness, and the drawing is 
crowded to the left margin, often very markedly. Relatively few distortions or ex- 
cessive amounts of time in drawing are encountered. Psychiatrically, this type of 
patient engages in less motor activity than the other two groups, but particularly in 
contrast to the B group. All expression seems to be more limited with this type of 
patient; e.g., there is an absence of overt hostility. In a similar vein, this type re- 
mains much more isolated than the B group does or even than the A. There is some- 
what more dysphoria than for the others. Three of the most typical cases were post- 
traumatic while the other two were post-infectious and paretic. 

In summary, it is felt that the results of the study are not sufficiently definite 
to suggest any classificatory system for organic reactions, but some start has been 
made. It is difficult to compare the results of this study with the previous one by 
Wittenborn since he used “newly admitted patients” and did not sample post- 
traumatic behavior. Also, his factors appear in an orthogonal representation. Wit- 
tenborn’s factor I, instability with depressed and excited features, was a fairly large 
one as is the A factor of this study (Organics with Curvilinear Distortion). It seems 
likely that there is a correspondence between these two. Possibly his rather small 
factor III, which is a mixture of excitement-depression, hebephrenic schizophrenic, 
and conversion hysteric, may correspond to the B type of this study (Organics with 
Spatial Disability and Loss of Control). The C type of this study (Organics with 
Constriction and Feelings of Inadequacy) could not be matched with any on his list 
of six factors. The nature of this factor suggests that it would be found in a popula- 
tion that had more time to adapt to organic deficits through employing constriction. 
It would not be surprising to find that the factor was specific for a chronic popula- 
tion such as the one in this study, and absent in studies of new admissions. The in- 
creased loading on C for one subject as he adapted to his disabilities in the course of 
time (subject number 28 going to number 4, Table 1) would support this interpreta- 
tion. 


SUMMARY AND CONCLUSIONS 


1. The Bender-Gestalt performance of 27 male organic patients was subjected 
to a transposed factor analysis. It was hoped that information about natural group- 
ings of these individuals would be revealed and that types of organic patients might 
be disclosed. Ratings of psychiatric characteristics provided information about 
patients with particular types of Bender performance. 


2. Factor analysis disclosed a fairly large commonness among the Bender 
performances of the individuals in this study. Three types of organics were en- 
countered. They are as follows: A. Organics with curvilinear distortions related to 
emotional instability, B. Organics with spatial disability and loss of control, related 
to personality disorganization, and C. Organics with constriction and feelings of 
inadequacy related to ego compensation for recognized deficits. 


3. The first was a fairly large group factor, but its characteristics were not 
those commonly attributed to organics. This factor seems to be significantly re- 
lated to traumatic etiology. The results of the study are discussed with respect to a 
previously reported transposed analysis of the psychiatric characteristics of organics. 


THE BENDER GESTALT VISUAL MOTOR TEST AS A CULTURE FREE 
TEST OF PERSONALITY 


H. E. PEIXOTTO 


Catholic University of America 


INTRODUCTION 


There appears to be an assumption basic in many studies that non-verbal stim- 
uli are “culture free.” Thus, many projective tests are used with equal aplomb for 
patients of all cultures, in spite of such studies as those of Harrower“), Linder- 
felt ©), Reiman™ and others whose results point to the contrary. It appears that the 
Bender Gestalt is as culture free as any test. Since a minimum of language is necess- 
ary to accomplish the task, it might be assumed that all individuals of equal intelli- 
gence would be equally capable of reproducing the figures accurately. The technique 
is used as a diagnostic tool to evaluate personality as well as to estimate intelligence 
and it is important, therefore, to know whether all protocols should be based on the 
same standards. The question, then, is whether the Bender Gestalt reflects cultural 
differences among several ethnic groups, or whether variations in the reproductions 
can be considered indicative of intellectual level and personality dynamics in spite 
of cultural differences. 


PROCEDURE 


In order to investigate this problem seven cultural groups were used, five sub- 
jects in each group. The subjects were all male and were chosen to fall within specific 
ranges for age and intelligence. The data concerning subjects are presented in Table 
1. The thirty-five subjects were all referrals to the Psychological and Psychopathic 
Clinic of the University of Hawaii. It should be noted that in the Hawaiian Islands 
there is a distinction made between caucasian Portuguese and the so-called “Portu- 
guese” who seem to have some negro blood, are probably from Cape Verde and were 
originally imported as laborers for the plantations. It is these latter who have been 
referred to as Portuguese among the cultural groups. 


TABLE 1. DATA CONCERNING THE 35 SUBJECTS 

Group               Age Range (yrs)    IQ Range 
Chinese             15-19              88-135 
Japanese            15-24              98-114 
Caucasian           15-17              94-106 
Part-Hawaiian       15-26              87-125 
Portuguese          14-31              86-102 
Chinese-Hawaiian    15-19              82-106 
Filipino            16-18              90-107 


It was planned at the outset to keep the age range between sixteen and twenty- 
five years. However, expediency made it necessary to expand the range at both 
ends of the distribution. Although the final age range is rather wide, it is felt that it 
is justified in that sufficient maturation has taken place by fourteen years so that the 
subject is capable of reproducing the Gestalten without difficulty. The IQ range was 
determined on a similar principle, that is, subjects were chosen who had sufficient 
intellectual development to reproduce the Gestalten accurately. Therefore the lower 
limit of the IQ decided upon was 80, the lower limit of normal intelligence, and no 
upper limit was set. The number of subjects, five, in each group was entirely a mat- 
ter of expediency with the controls set up as mentioned above. 

The tests were given in the ordinary clinic setting. The subjects were clinic 
patients who had been referred for a variety of reasons. The intelligence tests were 
scored as part of the clinic routine. The Bender Gestalt records were not scored until 
sometime later and were done specially for this study. As a matter of fact, the Pascal- 
Suttell scoring system had not been published until after the data had been collected. 
The scoring system follows that of Pascal and Suttell excluding design A. However, 
raw scores are used as it seemed little would be gained for this study by transposing 
them to standard scores. 

In order to determine the most appropriate measure of intelligence for this 
study, Pascal and Suttell‘” raw scores on the Bender Gestalt were correlated with 
educational level, the IQ from either the Stanford-Binet as revised by Porteus or the 
Wechsler-Bellevue Form I, and the IQ from the Porteus Maze. The correlation of 
the Bender Gestalt with the two verbal tests (the Porteus Stanford-Binet and the 
Wechsler-Bellevue, whose results may be referred to as Abstract Intelligence) is 
significant at the .01 level of confidence, while the correlation with educational level 
is significant at the .05 level. The correlation with the Porteus Maze is .19, which is 
not significant. 
These results necessitated changing the criterion for selection of subjects from practi- 
cal ability as measured by the Porteus Maze to Abstract Intelligence. 


RESULTS 


These data lend themselves most readily to analysis of variance to answer the 
question posed in this study, i.e., to indicate whether or not there is any significant 
variation among the cultural groups. The results presented in Table 2 indicate that 
variation among the different cultural groups is significant at the 5% level of con- 
fidence. Thus, although the number in the groups is small there is a fair degree of 
certainty that the variation among the different nationalities is not due to chance, 
but that the variation in scores on the designs is attributable to specific factors which 
vary from group to group. Similar results were found among the Gestalten of the 
test; the results are significant at the 5% level of confidence. Thus the designs them- 
selves vary considerably in difficulty, at least as measured by the Pascal-Suttell 
scoring system. This finding, in a sense, conforms with Bender’s®? view that the 
Gestalten represent maturational elements. The interaction is not significant, sug- 
gesting that the variance found is quite evenly spread among both the ethnic groups 
and the Gestalten. 


TABLE 2. RESULTS OF ANALYSIS OF VARIANCE 

Source                            Sum Squares    df    Mean Squares    F 
Between Rows (Gestalten)          222.98               27.873          2.146 
Between Columns (Nationalities)   175.33               29.223          2.50 
Interaction                       602.03               12.542 
Within                            3272.60              12.987 
Total                             4272.95 

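
The entries of Table 2 can be checked against one another, since each mean square is its sum of squares divided by its degrees of freedom and each F is a mean square divided by the within mean square. The sketch below does this arithmetic from the printed sums of squares and mean squares. The dictionary layout and function names are illustrative assumptions, and recovering the degrees of freedom by rounding the quotient presumes that the printed figures are accurate.

```python
# Sums of squares and mean squares as printed in Table 2.
TABLE_2 = {
    "Gestalten":     {"ss": 222.98,  "ms": 27.873},
    "Nationalities": {"ss": 175.33,  "ms": 29.223},
    "Interaction":   {"ss": 602.03,  "ms": 12.542},
    "Within":        {"ss": 3272.60, "ms": 12.987},
}

def degrees_of_freedom(ss, ms):
    """Recover df from the identity MS = SS / df (rounded to a whole number)."""
    return round(ss / ms)

def f_ratio(ms_effect, ms_error):
    """F is the effect mean square over the error (within) mean square."""
    return ms_effect / ms_error

df = {name: degrees_of_freedom(row["ss"], row["ms"]) for name, row in TABLE_2.items()}
ms_within = TABLE_2["Within"]["ms"]
f_values = {
    name: f_ratio(TABLE_2[name]["ms"], ms_within)
    for name in ("Gestalten", "Nationalities", "Interaction")
}
```

On these figures the degrees of freedom come to 8, 6, 48, and 252, and the ratio for the Gestalten reproduces the printed 2.146. The ratio for nationalities comes out near 2.25 rather than the printed 2.50, which may reflect a slip in printing or transcription; either value is above the tabled 5% point of roughly 2.1 for 6 and about 250 degrees of freedom. The interaction ratio falls just below 1, in keeping with the statement that the interaction is not significant.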

The significance of the differences between the mean scores on all items made 
by various pairs of nationalities was determined by means of “t” tests. Only five 
differences significant at the 5% level or better were obtained. In view of the number 
of possible differences that might be computed one would expect at least this many 
by chance. Although these findings are not clearly significant they have been in- 
cluded because some interesting dynamics are implied, and for comparison with 
other studies which might be done in this area. 
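
The remark about chance expectation can be made concrete with a little arithmetic. The sketch below counts the possible pairwise comparisons and the number expected to reach the 5% level by chance alone. The figure of nine scored items (eight designs plus Configuration) is an assumption inferred from the analysis of variance and is not stated in the text; the function name is illustrative.

```python
from math import comb

def expected_chance_hits(n_groups, n_items, alpha):
    """Return (pairs of groups, total t tests, tests expected significant by chance)."""
    n_pairs = comb(n_groups, 2)        # distinct pairs of nationalities
    n_tests = n_pairs * n_items        # one t test per pair per scored item
    return n_pairs, n_tests, n_tests * alpha

# Seven groups as in the study; n_items=9 is an assumed count (see lead-in).
n_pairs, n_tests, expected = expected_chance_hits(n_groups=7, n_items=9, alpha=0.05)
```

Under these assumptions there are 21 pairs and 189 comparisons, of which about nine would be expected to appear significant at the 5% level by chance, so the five differences obtained do not exceed chance expectation.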

The first Gestalt included in the Pascal and Suttell scoring system, a line of 
paired dots, differentiated the Chinese from the Filipinos at the .01 level of con- 
fidence, and the Portuguese from the Filipinos at the same level. Item four, an 
open square and curved line differentiated the Japanese and Chinese at the .05 level. 
The item defined as Configuration, general appearance of the complete protocol, 
differentiated not only the Japanese and Chinese, but also the Japanese and Fili- 
pinos both at the .01 level of confidence. 


DISCUSSION 


It seems questionable, from the results presented above, whether even a test as 
apparently culture-free as the Bender Gestalt is, in actuality, completely free from 
cultural influences. The levels of significance obtained are of such magnitude that 
it is felt that the universality of interpretations of protocols from this technique 
should be questioned. 

Differential findings among various ethnic groups have been found with other 
projective tests. Many of these studies with the Rorschach are well known. One such 
study less well known is that of Linderfelt ©) who found certain Rorschach determ- 
inants to be racially determined while others seemed to be influenced by locale or 
habitus. Lowenfeld reported a difference between American and English subjects on 
the Mosaic Test, as did Reiman®) working with English, Eskimo and American 
children. 

Studying the Gestalten did show a significant difference among the various 
groups; certain stereotypes are sameal while others are not. According to the 
interpretation of the designs by Suczek and Klopfer®® one would surmise that the 
Chinese and Portuguese are conforming individuals and that the other cultural 
groups range along a continuum of conformity with the Filipinos the least conform- 
ing. Difficulty with Figure 4 is indicative of “the presence of internal incongruities 
as a source of anxiety.”“°) The Chinese and part-Chinese scored highest on this 
figure while the Japanese had least difficulty with it. This finding does not seem 
consistent with any obvious cultural-anthropological hypothesis. 

The third item, ‘Configuration,’ refers to the overall performance on the test. 
This item is frequently weighted heavily in estimating intellectual level of the sub- 
ject as well as other personality dynamics. Most of the errors on this item were 
caused by lack of orderliness in copying the Gestalten. If we assume orderliness to be 
indicative of intellectual level, the scores are consistent for the Japanese since they 
are the subjects with the highest mean intelligence rating among the subjects and 
also the highest mental age and maturational level of development. This is also con- 
sistent with findings from other studies in which the Japanese have scored above the 
other ethnic groups on tests of intelligence“. On the other extreme the results are 
consistent with the stereotype regarding the Filipinos in this community and with 
the results of other studies but not with the intellectual level or maturational level 
among the subjects of this study. The high error score obtained by the Chinese is 
not consistent with either the expectations from data in Table 1, or the results of 
other studies. 


SUMMARY AND CONCLUSIONS 


The present study is an attempt to show the degree to which a non-verbal, pro- 
jective technique can be considered culture free. The 35 subjects represented seven 
different ethnic groups all residing in the Territory of Hawaii. The experimental 
design includes presentation of the Bender Gestalt Visual Motor Test in the standard 
clinical manner and scoring protocols according to the Pascal-Suttell system. The 
results are treated by analysis of variance. 

The results indicate variances which are significant at better than the 5% level 
of confidence and which, while not definitive because of the small N, do suggest the 
probability that various ethnic groups will produce different protocols, so that in 
this sense, the technique is not culture-free. A suggestion was indicated regarding 
interpretation of personality dynamics on a group level according to caucasian stand- 
ards. The discrepancies in the accepted stereotypes and results of other studies 
would indicate that this area is a fertile field for ethnological pursuit. Further studies 
following the lines of this study are indicated using a more adequate sample of sub- 
jects and also combining the technique of Suczek and Klopfer for a complete under- 
standing of the results of the Bender Gestalt with various cultural groups. 


RATING SCALE TECHNIQUE APPLIED TO RORSCHACH RESPONSES 


LAWRENCE M. BAKER AND JOHN A. CREAGER 


Purdue University 


PROBLEM 


The purpose of this study was to develop a rating scale technique that could be 
applied to Rorschach responses and then to compare the results obtained by this 
means with the Rorschach records obtained from the same subjects by the con- 
ventional method. Several attempts have been made to adapt Rorschach tests for 
rapid screening purposes (3, 4, 6, 7). Usually speed and objectivity of scoring have 
been the primary aims. It was our intention to retain as many of the advantages 
thought to characterize projective tests as possible; to allow for self administration 
and to make scoring entirely objective. 


PROCEDURE 


Rorschach cards were fastened in a loose-leaf type of binder by rings so the 
test could be self-administered. A list of responses taken from previously obtained 
Rorschach protocols was prepared for the subject to rate. The instructions to the 
subject were placed before the statements to be rated and were as follows: 

“You have been given a set of ten cards with designs on them made of inkblots. For each 
card there is a page in this booklet containing a series of items telling what the design on the 
card might be. Read the listed item and rate it according to how much the item is like what 
you see in the inkblot. Use the following scale: 4-excellent; 3-good; 2-fair; 1-poor; 0-do not see 


it. Place the number of your choice in the column at the right. Be sure to rate every item, but 
do not spend too much time on any one item.” 


A pilot study was run to check upon procedure. We found relatively little 
difficulty among subjects in identifying the location on the figure to which the res- 
ponse referred. From the pilot study, items were selected for their discriminating 
power and one hundred fifty were retained. The items retained were distributed 
over all the cards with frequencies consistent with those reported by Beck (1). 

The scoring of the items was fixed with respect to the determinant categories 
by prescoring the items by the Beck system. A given subject’s rating of a given 
item, e.g. a “3”, contributes to the score in each determinant category involved in 
that item. The item ratings in each category are then summed to obtain a total of 
ratings for that category. Thus, a “3” rating given for ‘‘the bat” on card I contri- 
butes three points each to scores in Whole (W), Form quality (F+), Animal (A), 
and Popular (P), just as such a response on an individual Rorschach protocol would 
contribute one point for each determinant on the summary sheet. The rating scale 
therefore represents a battery of quasi-Likert scales, one for each determinant. The 
items are thus being weighted by the ratings assigned by the subject. For obtaining 
a total production score upon which to base percentages, and by which to measure 
response output analogous to the number of response ratings, all items are summed 
regardless of scoring categories. 
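
The scoring rule just described can be sketched in a few lines. The entry for “the bat” follows the example in the text; the second item, its scoring, and all names in the code are hypothetical and serve only to show how ratings accumulate.

```python
# Determinant categories pre-scored for each (card, item) pair.
ITEM_CATEGORIES = {
    ("I", "the bat"): ["W", "F+", "A", "P"],   # example given in the text
    ("I", "two people"): ["D", "F+", "H"],     # hypothetical item and scoring
}

def score_ratings(ratings, item_categories=ITEM_CATEGORIES):
    """Add each 0-4 rating to every category in which its item is pre-scored."""
    totals = {}
    for item, rating in ratings.items():
        for category in item_categories[item]:
            totals[category] = totals.get(category, 0) + rating
    production = sum(ratings.values())   # total production: all ratings summed
    return totals, production

def percentage(totals, production, category):
    """Category total expressed as a percentage of total production."""
    return 100.0 * totals.get(category, 0) / production

totals, production = score_ratings({("I", "the bat"): 3, ("I", "two people"): 2})
```

Here the rating of 3 for “the bat” adds three points each to W, F+, A, and P, while the rating of 2 for the hypothetical item adds two points each to D, F+, and H, so that F+ receives five points in all and the total production score is 5.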

Two samples of population were used. The first consisted of 32 students from 
a course in general psychology. The first group was used for studying reliability. 
The second sample consisted of 120 male students in first year engineering who were 
given the Rorschach Test consistent with the Beck system of administration and 
scoring. The second group was used in a validation study in which results from the 
rating scale were validated against results obtained from the Rorschach test. The 
Rorschach Test was presented two weeks before the Rating Scale since it was assumed 
that this order would be less likely to distort results. There still might be some 
residual influences from the Rorschach Test to the Rating Scale but this, it was 
thought, would be of no great importance relative to our present task. 

The assumption underlying the procedure is that the subject will tend to give 
high ratings to those items he would give when taking the Rorschach without sug- 
gestion, and that his rating is an index to perceptual acceptance which reflects person- 
ality characteristics. Our hypothesis may be stated as follows: There will be a re- 
liable positive relationship between the frequency of Rorschach responses on the 
summary sheet and the rating of quality of typical Rorschach responses. 


RESULTS 


The reliability studies involved two estimations of reliability for each determin- 
ant category. Using the sample of 32 subjects, the internal consistency of the raw 
ratings was estimated by the analysis of variance method of Hoyt. The reliabil- 
ities thus obtained are equivalent to those from the Kuder-Richardson Formula 20, 
and range from .74 to .97 with a median of .83. Another estimate of reliability was 
made by treating the percentage scores in the various categories as test scores, and 
test-retest reliabilities were computed using two weeks as the inter-test period. These 
coefficients range from .51 to .90 with a median of .78. The median reliability for the 
various category scores as obtained by two different estimates is thus .80 and in- 
dicates that the method is capable of yielding reproducible and consistent data. 
The percentage scores in each category were tested for normality by Chi square. 
Of 24 scores, 22 met the test of normality at the five per cent level. 
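
The analysis of variance estimate of internal consistency can be sketched as follows. In Hoyt's method the reliability is the proportion of the between-persons mean square that is not residual (persons by items) variance. The ratings in the example are hypothetical, and the function name is illustrative.

```python
def hoyt_reliability(scores):
    """Hoyt's ANOVA reliability: (MS_persons - MS_residual) / MS_persons.

    scores[p][i] is the rating given by person p to item i.
    """
    n_persons = len(scores)
    n_items = len(scores[0])
    grand_mean = sum(sum(row) for row in scores) / (n_persons * n_items)
    person_means = [sum(row) / n_items for row in scores]
    item_means = [sum(row[i] for row in scores) / n_persons for i in range(n_items)]

    ss_persons = n_items * sum((m - grand_mean) ** 2 for m in person_means)
    ss_items = n_persons * sum((m - grand_mean) ** 2 for m in item_means)
    ss_total = sum((x - grand_mean) ** 2 for row in scores for x in row)
    ss_residual = ss_total - ss_persons - ss_items

    ms_persons = ss_persons / (n_persons - 1)
    ms_residual = ss_residual / ((n_persons - 1) * (n_items - 1))
    return (ms_persons - ms_residual) / ms_persons

# Hypothetical 0-4 ratings from five persons on three items of one category.
example_ratings = [[4, 3, 4], [2, 2, 1], [3, 3, 4], [0, 1, 0], [1, 2, 2]]
reliability = hoyt_reliability(example_ratings)
```

For ratings on a graded scale such as this one, the coefficient is algebraically the same as coefficient alpha, which reduces to Kuder-Richardson Formula 20 when items are scored 0 or 1.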

The validation study is here conceived as testing the hypothesis that the various 
categories on the Inkblot Rating Scale were measuring the same things as the in- 
dividual Rorschach procedure as the latter is revealed in the summary sheet. In 
addition the experiment involved answering the question of the degree of corres- 
pondence between conventional Rorschach and Inkblot Rating Scale scores using 
correlational techniques. In making these comparisons the Rorschach scores were 
treated by the methods suggested by Cronbach (2). 

Raw scores and percentages from the Rorschach protocols were correlated 
with the corresponding raw scores and percentages from the Inkblot Rating Scale. 
For the categories in which these tests were made, the correlations obtained were 
low but positive mostly ranging from .20 to .30, and significant at the five per cent 
level. It is clear that, while some significant overlap exists between sets of compar- 
able scores, the correlations are too low to justify interpreting a score from the Ink- 
blot Rating Scale as psychologically equivalent to the corresponding Rorschach 
score. It was not, of course, our purpose to use this test as a substitute for the 
Rorschach. The Rating Scale scores are weighted according to intensity by the 
subject’s ratings, whereas the summary sheet scores from the Rorschach Test treat 
each response with unit weight. 
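
That correlations as low as .20 reach the five per cent level follows from the size of the sample. The sketch below converts a correlation to a t value with N - 2 degrees of freedom; the critical value of 1.98 is the usual tabled two-tailed figure for about 120 degrees of freedom and is supplied here as an assumption, as are the function names.

```python
import math

CRITICAL_T_05 = 1.98   # assumed tabled two-tailed 5% value for about 120 df

def t_for_correlation(r, n):
    """t statistic for testing a product-moment r against zero, df = n - 2."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)

def smallest_significant_r(n, critical_t=CRITICAL_T_05):
    """Smallest r that reaches the critical t for a sample of n."""
    return critical_t / math.sqrt(critical_t ** 2 + (n - 2))

t_low = t_for_correlation(0.20, 120)    # about 2.22
t_high = t_for_correlation(0.30, 120)   # about 3.42
r_min = smallest_significant_r(120)     # about 0.18
```

With 120 subjects any correlation above about .18 is statistically significant, which is why coefficients of .20 to .30 can be significant and yet, accounting for less than a tenth of the variance, too low to treat the two sets of scores as equivalent.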

It was found that the mean administration time for 120 subjects was 27.5 min- 
utes with a standard deviation of 9.8 minutes. Since as many subjects could be 
tested at one time as there are sets of materials and the answers could be machine 
scored, the method lends itself to great speed. 


SUMMARY AND CONCLUSIONS 


This study attempts to apply a rating scale technique for administering and 
scoring the Rorschach test by a group method. The modified technique involves 
presentation of the Rorschach cards with instructions to the subject to rate presented 
responses on a five point rating scale. Most of the determinants of the Beck system 
are preserved. 


From the results of the present study we conclude: 


1. The rating scale in which Rorschach Test responses are used is a highly 
reliable measuring instrument yielding low, positive, and statistically significant 
relationship with Rorschach scores in the corresponding determinants on the Beck 
summary sheet. 


2. It appears feasible to construct various projective test items such as ink- 
blots or pictures, select responses from the administration of these items, then place 
these responses in a rating scale which can be expected to yield reliable and valid 
results for the measurement of personality traits similar to those thought to be 
measurable by projective tests. 


REFERENCES 

1. Beck, S. J. Rorschach’s test: I. Basic processes. New York: Grune and Stratton, 1944. II. A 
variety of personality pictures. New York: Grune and Stratton, 1945. 
2. Cronbach, L. J. Statistical methods applied to Rorschach scores. Psychol. Bull., 1949, 46, 395- 
429. 
3. Eysenck, H. J. A comparative study of four screening tests for neurotics. Psychol. Bull., 1945, 
42, 659-662. 
4. Harrower-Erickson, M. R. and Steiner, M. E. Large scale Rorschach techniques. Springfield, 
Ill.: Chas. C. Thomas, 1945. 
5. Hoyt, C. Test reliability obtained by analysis of variance. Psychometrika, 1941, 6, 153-160. 
6. Munroe, R. L. The inspection technique; a method of rapid evaluation of the Rorschach pro- 
tocol. Rorschach Res. Exch., 1944, 8, 46-70. 
7. Steiner, M. E. The use of the Rorschach method in industry. Rorschach Res. Exch., 1947, 11, 
46-52. 


COMPARISON OF THE GRASSI BLOCK SUBSTITUTION TEST WITH 
THE WECHSLER-BELLEVUE IN THE DIAGNOSIS OF ORGANIC 
BRAIN DAMAGE¹ 


JAMES E. PTACEK AND FLORENCE M. YOUNG 


Mississippi State Hospital, Whitfield, Mississippi University of Georgia 


PROBLEM 


The present study reports the results from the use of a relatively new test for
organically involved cases, the Grassi Block Substitution Test (GBST), and the
Wechsler-Bellevue Intelligence Scale (Form I), as instruments for the diagnosis of
organic brain damage. The findings from the use of the GBST are compared with
those based upon Wechsler’s signs for organicity and Wechsler’s Mental
Deterioration Index (W-B MDI).


SUBJECTS 


The organic group was composed of 24 subjects ranging in age from 15 through
43 years. The educational average was eleven years of schooling. Fifteen
cases were U. S. Marines referred from surgical and neurological wards at the U. S.
Naval Hospital, Camp Lejeune, North Carolina. Six cases were mental patients in
state hospitals, and three were referred by a clinic or a physician. All patients were
medically and clinically diagnosed as having organic brain damage and all had
undergone complete neurological workups. Nineteen were victims of accidents. The
diagnostic classifications were: brain concussion, 6; skull fracture, posttraumatic
encephalopathy, 5; chronic brain syndrome associated with brain trauma, 3; brain
contusion, 2; brain compression, with meningeal hemorrhages, 1; chronic brain
syndrome associated with brain trauma, 1; skull fracture, concussion with subdural
hematoma, 1; foreign body, right frontal lobe, 1; mental disorder, non-psychotic,
with associated structural changes (post-traumatic), 1; psychosis, post-traumatic, 1;
multiform glioblastoma, 1; cerebral palsy, 1.


¹This paper is abstracted from a M. S. thesis submitted to the Graduate School of The University
of Georgia by Mr. Ptacek. It was prepared for publication by the senior author.
(MC) U.S. N., Executive Officer; Lt. W. E. Schumacher (MC) U.S. N., Chief of Neuropsychiatry;

The normal group was composed of 30 U. S. Marines in convalescence on medi- 
cal and surgical wards at the U. S. Naval Hospital, Camp Lejeune, North Carolina. 
These subjects were referred by medical officers and gave no history of brain injury. 
The age range of the normal group was from 18 through 37 years of age, and the 
average education was 11.2 years of schooling. 

For organic cases, the average time required for the Wechsler-Bellevue was two 
to two and one-half hours; for normals, one to one and one-half hours per subject. 
For the Grassi test, the average time for organics was from one to one and one-half 
hours; for normals, twenty to forty minutes per person.


RESULTS 


The range of scores obtained on the GBST by the organic group was 4.5 to 24,
with a mean score of 13.9, as compared with Grassi’s range of 2.5 to 16 and a mean
score of 10.2. It was not possible to compute any measure of reliability or
significance of difference since Grassi does not furnish any more information than that
presented above. The difference in means may be accounted for by the fact that
the present study dealt with a more select group of organic patients than did Grassi
(6, p. 5). The range of scores obtained on the GBST by the normal group was 17 to
26.5, with a mean score of 22.5. These data closely resemble those of Grassi. A
comparison of the organic group mean of 13.92 (S. D. of 5.95) and the normal group
mean of 22.50 (S. D. of 4.68) obtained on the GBST in this study yields a C.R. of
5.77, which is significant at the .01 level.
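
The critical ratio just quoted can be checked from the reported means, standard deviations, and group sizes. The sketch below assumes the usual large-sample formula (difference in means divided by the standard error of the difference, with each variance divided by N); the paper does not state which formula was used, and the function name is hypothetical. Under that assumption the result is about 5.78, in close agreement with the printed 5.77.

```python
from math import sqrt


def critical_ratio(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """Difference between two means divided by its standard error.

    Assumes independent groups and variances divided by N (not N - 1).
    """
    standard_error = sqrt(sd_1 ** 2 / n_1 + sd_2 ** 2 / n_2)
    return abs(mean_1 - mean_2) / standard_error


# GBST figures reported above: organics (N = 24) and normals (N = 30).
gbst_cr = critical_ratio(13.92, 5.95, 24, 22.50, 4.68, 30)
print(round(gbst_cr, 2))  # 5.78
```

A ratio of this size lies well beyond 2.58, the value conventionally required at the .01 level.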

On the Wechsler-Bellevue, from which eleven subtests were given, comparisons
of IQ’s of organics and normals result in C. R.’s which are significant at the .01
level (Table 1). The GBST classified only 25% of the organic group as having no
deterioration, while the W-B MDI put 50% in this category, a difference which is
significant at the .05 level (Table 2).


TABLE 1. MEANS, STANDARD DEVIATIONS, DIFFERENCES IN MEANS, CRITICAL RATIOS, LEVELS OF
SIGNIFICANCE FOR VERBAL I.Q.’S, PERFORMANCE I.Q.’S, AND FULL SCALE I.Q.’S OBTAINED BY
ORGANIC AND NORMAL GROUPS ON THE WECHSLER-BELLEVUE INTELLIGENCE SCALE
(ORGANICS N-24) (NORMALS N-30)

                    Organics             Normals         Difference
I.Q. Scale        Mn.      S.D.       Mn.       S.D.      in Means      C.R.

Verbal           80.50    15.66     102.67     11.85       22.17       5.65**
Performance      72.29    20.93     111.67     10.15       39.38       8.30**
Full Scale       74.75    45.36     107.50     12.00       32.75       4.28**

*In this and later tables, denotes significance above the 5% level of confidence.
**In this and later tables, denotes significance above the 1% level of confidence.

Although all differences are not statistically significant, it appears that the
GBST is the more accurate of the two indicators, especially with reference to the
severe cases. It should be noted that the GBST is not a perfect instrument, in that
it classified 30% of the normals as having moderate mental deterioration
(Table 2).

The inadequacy of the W-B MDI has been noted by other investigators, for
instance, Allen, Blake and McCarthy, Kass, and Gutman.

Using Grassi’s critical score of 16 to classify severe and definite mental
deterioration due to organic brain damage, 83% of the entire group of 54 subjects
were classified correctly and 17% were misclassified. All 30 of the normals were
correctly classified, while 9 of the 24 organics were misclassified.


TABLE 2. PERCENTAGE OF ORGANIC GROUP AND NORMAL GROUP AS CLASSIFIED BY THE GRASSI TEST
AND THE WECHSLER-BELLEVUE MENTAL DETERIORATION INDEX, DIFFERENCES IN PER CENTS,
σ DIFFERENCES IN PER CENTS, AND CRITICAL RATIOS (ORGANICS N-24; NORMALS N-30)

                         %                      %         Diff.       σ Diff.
            Grassi     of Ss     W-B MDI      of Ss      in %’s       in %’s      C.R.

Organics    Severe      ...      Definite       33         ...          ...        ...
            Moderate    ...      Possible       17         ...          ...        ...
            None         25      None           50         ...          ...        ...

Normals     Severe        0      Definite       10          10          ...        ...
            Moderate     30      Possible       20          10          ...        ...
            None         70      None           70           0          ...        ...

Notes: Reliability coefficients: For organics, .43; for normals, -.089

In applying Wechsler’s clinical signs for organicity to the entire group
of 54 subjects, 74% were correctly classified and 26% were misclassified. Of the 30
normals, 4 were misclassified, while 10 of the organics were misclassified. A critical
point of four clinical signs was obtained by graphic methods, overlapping
distribution curves.

A comparison of the two percentages, 83% correctly classified by the GBST, 
and 74% correctly classified by the Wechsler clinical signs, resulted in a C.R. of .97 
which is not significant. Although the difference is not statistically significant, it is 
evidence that the GBST, in being 9% more accurate than the Wechsler clinical sign 
method, may be somewhat the more sensitive in detecting severe deterioration due 
to organic brain damage. 
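
The two percentages being compared follow directly from the misclassification counts given above. A minimal sketch of that arithmetic (the function and variable names are illustrative only):

```python
def percent_correct(misclassified, total):
    """Percentage of cases classified correctly, given the number of errors."""
    return 100 * (total - misclassified) / total


total_subjects = 54  # 24 organics + 30 normals

# GBST with Grassi's critical score of 16: 0 normals and 9 organics misclassified.
gbst = percent_correct(0 + 9, total_subjects)
# Wechsler clinical signs: 4 normals and 10 organics misclassified.
wechsler = percent_correct(4 + 10, total_subjects)

print(round(gbst), round(wechsler))  # 83 74
```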

In general, the results of the present study agree with Wechsler’s statements
that for organics the most adversely affected of the eleven subtests are Digit Symbol, 
Arithmetic, Similarities, Object Assembly, Digit Span and Block Design. 

Item analysis of the results of all GBST designs shows that both groups did 
poorest on step 4, difficult abstract thinking (Table 3). For steps 2 and 3, simple 
abstract thinking and difficult concrete thinking, the records within each group were 
similar. As for Grassi’s cases, the subjects in the present study had their best
performance on step 1, simple concrete thinking.


TABLE 3. ITEM ANALYSIS OF FAILURES ON THE GRASSI TEST BY ORGANIC AND NORMAL SUBJECTS
(O-FAILURES OF ORGANICS; N-FAILURES OF NORMALS)

Step    Design I    Design II    Design III    Design IV    Design V
N N 0 N 0 N 0 N 





0 5 0 4 0 5 
1 6 1 10 5 12 
1 7 1 6 0 7 
12 9 16 10 18 
14 36 15 42 




























A comparison of the results of organics and normals (Table 3) indicates a marked
difference in the number of failures, respective totals being 178 and 65. All of Grassi’s
ten behavioral signs appeared frequently in the organic group. Those occurring
most often were: use of trial and error, inability to shift, need for reassurance,
inability to recall instructions, difficulty in diagonal relationships and inability
to recognize incorrect designs. Organic cases showed similar behavioral signs while
taking the Wechsler-Bellevue, especially demonstrating confusion, difficulty in
verbal expression, inability to understand, and slowness of reaction time.


SUMMARY 

The Grassi Block Substitution Test and the Wechsler-Bellevue Intelligence
Scale have been compared as instruments for the diagnosis of organic brain damage.
The latter test mistakenly classified 50% of the known organics as having no
deterioration, while the GBST was in error only in 25% of the cases, a difference which is
significant at the .05 level.

The GBST classified no normals as having severe deterioration, but the W-B
MDI indicated that 10% had definite mental deterioration. The GBST is not a
perfect test in that it indicated 30% of the normals as having moderate deterioration.

The GBST correctly classified 83% of the 54 cases while the Wechsler clinical
signs for organicity identified only 74%, a difference, however, which lacks statistical
significance.


AMMONS AND WECHSLER TEST PERFORMANCES OF COLLEGE AND
PSYCHIATRIC SUBJECTS*


ROBERT M. ALLEN AND THOMAS E. THORNTON AND CHARLES A. STENGER
University of Miami V. A. Hospital, Coral Gables, Fla.


PROBLEM 

The movement toward rapid screening of intellectual level finds further impetus 
in the Ammons Full-Range Picture Vocabulary Test (Ammons Test). This
comparatively new test consists of 16 plates with four pictures drawn on each. The
subject is instructed to point to the one of the four pictures on a given plate that 
best depicts the meaning of a test word. The raw score consists of the total number 
of correct responses for the 16 cards. These raw scores are transmuted into mental 
age equivalents and percentiles. The maximum raw score is 85 points for form A. 


*Published with permission of the Chief Medical Director, Department of Medicine and Surgery, 
V. A., who assumes no responsibility for the opinions expressed or conclusions drawn by the authors. 

The only published study of the performance of an adult population on the
Ammons Test is the original adult standardization investigation by Ammons, Larson,
and Shearn. With a sampling of 120 adults in varying employment status,
and presumably a non-psychiatrically involved population, these authors obtained a
correlation of .85 between Ammons Test, form A, raw scores and Wechsler Bellevue
Adult Intelligence Scale Vocabulary (W-V) subtest raw scores. They concluded:
“In view of the short administration time, intrinsic interest value for adults, and
excellent reliability and validity, the test should prove to be highly useful for such
purposes as estimating the intellectual capacity of verbally handicapped adults and
rapid screening where maximum efficiency is desired.” One caution for its
use is contained in this statement: “The present items are not hard enough
adequately to test the upper ranges of adult ability, those beyond IQ 125 on the
Wechsler” (p. 154).

The present study was undertaken to test three hypotheses: 

Statement I: Ammons Test performance is significantly related to intellectual 
status as measured by the full scale Wechsler IQ (W-IQ). (This is an extension of 
the problem investigated by Ammons, et al.) 

Statement II: The Ammons Test is as efficient as the W-V subtest for
differentiating levels of intellectual ability as measured by the W-IQ. (This flows from
the conclusions of Ammons, Larson, and Shearn and represents an effort to
support their findings.)

Statement III: The Ammons Test is equally efficient as a screening tool for 
normal, college, and psychiatric populations. (The Ammons Test is as resistive to 
the inroads of a pathologic process as the W-V subtest.) 


PROCEDURE 
Two groups of subjects were administered the Ammons Test and the 11 subtests
of the Wechsler Adult Intelligence Scale.¹ Table 1 presents the vital statistics
for the 100 subjects in the present study. The college population consisted of 49
undergraduate student volunteers at the University of Miami. None indicated a
history of personal maladjustment of a disabling nature. The 51 patients in the
psychiatric population were drawn from the Neuropsychiatric Service of the V. A.
Hospital at Coral Gables, Fla.²


TABLE 1. PERSONAL DATA, AMMONS AND WECHSLER TEST MEAN RAW SCORES, AND CORRELATIONS
BETWEEN AMMONS RAW SCORES AND WECHSLER SCORES

                                       Our Groups                    Ammons et al.
Factors                   College     Psychiatric     Combined       Adult Group

N                           49            51            100             120
Mean Age                    ...           34            27.9            24.9*
Age Range                   ...         19 - 65       17 - 65         18 - 34

Mean Raw Scores:
  Ammons                 76 ± 5.2      69 ± 6.7      72 ± 9.7        69 ± 9.4
  W-B Vocabulary         29 ± 3.8      24 ± 7.0      27 ± 6.3       23.4 ± 5.8
  W-B IQ                    ...       106 ± 15.5    115 ± 14.7         104**

Correlations r:
  Ammons and W-B Vocab.     ...          .87           .87             .85
  Ammons and W-B IQ         .46          .86           .81

*Estimated from averages of Ammons, Larson and Shearn population.
**Estimated IQ by Ammons, Larson and Shearn (p. 158).


¹In order to maintain a constant criterion for the total intellectual level as measured by the
Wechsler Full Scale IQ it was decided to use all 11 subtests as usually given in a clinical situation.


RESULTS AND DISCUSSION


Table 1 discloses the mean Ammons and Wechsler findings. The college group 
achieved significantly higher Ammons and Wechsler scores than the psychiatric 
population. This is to be expected in view of the selectivity operating in the college 
group. The psychiatric population obtained scores more closely resembling the 
adult standardization group of Ammons, Larson, and Shearn; see Table 1, third
and fifth columns. 

The relationships among Ammons Test scores and the two Wechsler scores
(V subtest raw score and IQ) may be seen in Table 1. All correlations are significantly
different from zero. The difference between the correlation coefficients for the
Ammons Test and W-V raw scores for both groups is statistically significant beyond
the 1 per cent level of confidence. The same level of confidence obtains for the
difference between the correlations of the Ammons Test raw scores and the W-IQ’s for
both groups.
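
One common way to test the difference between two independent correlation coefficients is Fisher's r-to-z transformation. The paper does not say which test the authors used, so the sketch below is an assumption offered only to show that the reported figures are consistent with the stated level of confidence; the function name is hypothetical.

```python
from math import atanh, sqrt


def fisher_z_difference(r_1, n_1, r_2, n_2):
    """Normal deviate for the difference between two independent correlations."""
    z_1, z_2 = atanh(r_1), atanh(r_2)  # Fisher's r-to-z transformation
    standard_error = sqrt(1 / (n_1 - 3) + 1 / (n_2 - 3))
    return (z_1 - z_2) / standard_error


# Ammons Test with W-IQ: psychiatric group (r = .86, N = 51), college group (r = .46, N = 49).
z = fisher_z_difference(0.86, 51, 0.46, 49)
print(z > 2.576)  # True: beyond the two-tailed 1 per cent point of the normal curve
```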

The data indicate: (a) that college students fulfill the expectation of higher
achievement on the Ammons and Wechsler Tests than the psychiatric group; (b)
that the more heterogeneous psychiatric population does as well as the adult
standardization group in regard to Ammons Test raw scores, W-V subtest raw scores, and
W-IQ; (c) that for all populations, individually and combined, there is a significant
correlation between Ammons Test and Wechsler (V and IQ) attainment.

With regard to Statement I, the present study finds sufficient evidence to accept 
the hypothesis. The Ammons Test is an adequate screening device, subject to the 
following considerations: 

1. The Ammons Test is far more efficient for screening subjects in the average range of
intelligence. This may be seen in the significantly higher correlation coefficient for the psychiatric
group as compared with the college group, .86 and .46 respectively. This is further supported
by the .85 Ammons Test and W-V correlation for the adult standardization group. The psychiatric
and adult groups are alike in that they are more heterogeneous than the college population
used in this study. The former two groups showed a wider distribution of Ammons and Wechsler
scores which consequently resulted in a higher correlation. The college population tended to pile
up Ammons Test raw scores at the higher end of the scale (note the college students’ mean raw
scores: Ammons Test, 76 ± 5.2 of a possible maximum of 85 points and W-V mean of 29 ± 3.8
of a possible maximum of 42 points). This greater accumulation of Ammons Test scores at the
upper end of the scale militated against a higher correlation. This supports the caution of Ammons,
et al., regarding the use of this test for screening the more intelligent subjects.

2. The Ammons Test correlates sufficiently well with the W-V raw score and the W-IQ to
warrant its use as a screening device for intellectual level and word usage.
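
The argument in consideration 1, that a group bunched at the upper end of the scale yields a lower correlation than a heterogeneous group, can be illustrated with a small simulation. Everything in the sketch is hypothetical: the population correlation of .85, the sample size, and the cut at the top fifth are illustrative assumptions, not data from this study.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Product moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
rho = 0.85              # assumed population correlation, for illustration only

pairs = []
for _ in range(5000):
    x = rng.gauss(0, 1)
    y = rho * x + sqrt(1 - rho ** 2) * rng.gauss(0, 1)
    pairs.append((x, y))

pairs.sort()               # order cases by the first score
top_fifth = pairs[-1000:]  # a selected group restricted to the upper range

r_full = pearson_r([x for x, _ in pairs], [y for _, y in pairs])
r_top = pearson_r([x for x, _ in top_fifth], [y for _, y in top_fifth])
print(round(r_full, 2), round(r_top, 2))
```

The restricted group shows a markedly lower coefficient than the full group, although both are drawn from the same population.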


Statement II is supported by the evidence of this study and Ammons’ original 
contribution. This is especially applicable for the middle range of intelligence and in 
a non-selected population. However, its use with the college group is not contrain- 
dicated, only cautioned against. 

Statement III is not totally acceptable. In view of the present findings and
those of Ammons, Larson, and Shearn, it must be inferred that educational
experience and college selectivity do influence Wechsler and Ammons scores. The
psychiatric group achieved at almost the same level as the adult standardization
group (see the third and fifth columns of Table 1) and both performed significantly
below the college group in all three areas of measurement. It would seem that the
psychiatric process did not depress both vocabulary test scores since the psychiatric
group fared as well as the adult standardization population in all three areas.

The Ammons Test is not equally efficient for the college and the psychiatric 
groups. It is less efficient as a screening test for the college student. 


²The psychiatric group included patients classified as neurotic and “controlled” psychotic. The
prime requisite for inclusion in this study was testability.


SUMMARY AND CONCLUSIONS


Three statements were made with regard to the efficiency of the Ammons Full- 
Range Picture Vocabulary Test, form A, as a screening device, using the Wechsler 
Bellevue Adult Intelligence Scale as the criterion of intellectual level. From the 
evidence in this study and that of Ammons, Larson, and Shearn it was found that
Statements I and II may be wholly accepted, while Statement III is only partially 
acceptable. The Ammons Test may be used as a short screening device for subjects 
in the average intelligence range, whether or not they fall within a psychiatrically 
defined population. 


A STUDY OF THE RELIABILITY OF AN ALTERNATE FORM FOR THE
SHIPLEY-HARTFORD ABSTRACTION SCALE*


REUBEN S. HORLICK AND HAROLD J. MONROE


Audiology and Speech Correction Center
Walter Reed Army Medical Center


Washington, D. C.


PROBLEM 


The purpose of this paper is to report on an alternate form of the
Shipley-Hartford Scale which, it is hoped, will measure reliably the same abilities as does
the original form, thus providing the clinician with another useful, comparative tool
with which to work.

The Shipley-Hartford Scale has been found useful for obtaining a quick estimate 
of present intellectual functioning, and for detecting impairment in mental efficiency. 
The total scale consists of a vocabulary test of forty items and an abstraction test of 
twenty items, each having a ten minute time limit. The vocabulary score is merely 
the number of correct responses for the forty items plus one point for each four items 
not tried. The abstraction score is the number of correct responses for the twenty 
items multiplied by two in order to make it comparable to the vocabulary score. 
Separate scores are obtained for each scale and the combined scores of both scales 
are converted into a total mental age which then can be compared with mental ages 
or intelligence quotients derived from other tests. Shipley reports reliability
coefficients for Army recruits of .87 for the vocabulary test, .89 for the abstraction test
and .92 for both tests combined. Manson and Grayson, in a study of 300 randomly
selected prisoners in an overseas disciplinary training center, report re-test reliability
coefficients for the abstraction scale of .84.
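
The two scoring rules described in this paragraph are simple enough to state as a short sketch. The function names and the example subject are hypothetical, and it is assumed here that an incomplete group of four untried items earns no credit; the text does not say how remainders are handled.

```python
def vocabulary_score(correct, not_tried):
    """Correct responses plus one point for each four items not tried."""
    return correct + not_tried // 4  # assumption: partial groups of four earn nothing


def abstraction_score(correct):
    """Correct responses doubled, to make the score comparable to vocabulary."""
    return 2 * correct


# Hypothetical subject: 30 vocabulary items correct, 8 not tried; 12 abstraction items correct.
print(vocabulary_score(30, 8), abstraction_score(12))  # 32 24
```

With forty vocabulary items and twenty abstraction items, both scores have the same maximum of 40.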

The abstract thinking test requires that the subject, through inductive and
deductive reasoning, arrive at a specific answer for each of the twenty items in the
allotted time period. Since the effect of practice on this test may play a role in
repeated testings of the same subjects as, for example, in follow-up or progress report
studies, an alternate form (Form II) shown in Fig. 1 has been devised. Each item of
the alternate form is an adaptation of a corresponding item on the original.


*We wish to express our appreciation to Dr. Clare Cornell, formerly with the Audiology and 
Speech Correction Center, who developed the alternate form and made it available for our use, and 
to Dr. Aram Glorig, Director, for his encouragement of this study. 


FIGURE 1. ALTERNATE FORM OF THE SHIPLEY-HARTFORD ABSTRACTION SCALE








Complete the following. Each dash (—) calls for either a number or a letter to be filled in. Every 
line is a separate item. Take the items in order, but don’t spend too much time on any one. 


START HERE 


45678 — 
hot cold big little young — — — 
BC CD DE E— 


GFEDCB— 

23432 34543 45654 567— — 

SE/NE NW/SE N/S E/— 

relate elate late — — — 

on no but tub moor — — — — 
WFXEYODZ-— 

bob bob bard drab 725 

hang an rash as hint in 

47915 79154 91547 

grow or star at lost so 

snowstorm stormdoor 3 

republic 12345678 cruel 81426 price 

has hat pie pig not now hen — — — 
dish vessel ship faucet tap dance circle ring sound fasten 
432 63 72 153 61 — 

bad bed pen pin hit hot pot — — — 
four r two w three r one — 











PROCEDURE AND RESULTS 


Both forms, the original and the alternate, were administered, in accordance 
with Shipley’s directions, to two groups of Army male recruits. Group A was com- 
posed of 112 men and Group B of 101 men. Group A was given the alternate form 
(Form II) of the Shipley-Hartford Scale first and the original (Form I) administered 
immediately afterward. The same procedure was followed for Group B with one
exception: Form I was administered first, followed immediately by the alternate
form. In addition, each group was administered the Cornell Index Questionnaire
and the Psychopathic Deviate Scale of the Minnesota Multiphasic Personality In- 
ventory. The data are reported in Table 1. 


TABLE 1. DATA FOR GROUPS A AND B BY MEAN SCORES FOR AGE, EDUCATION,
AND TEST PERFORMANCE

FACTORS                               Group A      Group B

Chronological Age                      19.9          ...
Education                              11.4          ...
Cornell Index Score                    10.2          ...
Shipley-Hartford Vocabulary            23.6          ...
Shipley-Hartford Abstraction I         25.6         25.5*
Shipley-Hartford Abstraction II        22.3*        28.0
Abstraction Difference I-II             3.3          ...

*Administered first in each group.


Comparison of mean scores for both groups in Table 1 indicates that they are
fairly similar. It will be noted that Group A is slightly older, has slightly more
education and shows less evidence of anxiety¹ as reported by mean Cornell Index scores.²


¹For all practical purposes we prefer to use the term anxiety rather than try to ascertain which
difficulties or symptoms are involved, for the Index does not definitely differentiate these
difficulties, but rather presents a pattern which would probably be better associated with an anxiety
picture.
²In this study we used a critical “score of 13”, which, according to the Index, “screens the
majority of those persons with serious neuropsychiatric or psychosomatic disturbances and a moderate
number of ostensibly healthy persons.”


Group B is the slightly brighter group as shown by the scores on the Shipley-Hartford
tests, particularly by comparison of mean vocabulary scores. These differences,
however, are not significant. It will also be seen that there is about a one item
improvement from the first administration to the second.

It will be observed from Table 2 that the mean differences from test to test are 
significant at the .01 level for Group A and at the .02 level for Group B. This indi- 
cates that the change is probably not due to chance occurrence. 


TABLE 2. SIGNIFICANCE OF MEAN DIFFERENCES IN PERFORMANCES OF EACH GROUP AND THE
COEFFICIENT OF RELIABILITY BETWEEN THE TWO ABSTRACTION TESTS

               Abstraction I     Abstraction II     Critical Ratio

Group A
  Mean             25.6              22.3                ...
  S. D.             8.2               8.0

Group B
  Mean             25.5              28.0                ...
  S. D.             8.1               7.0

* Difference is significant at the .01 level of confidence.
** Difference is significant at the .02 level of confidence.
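
The critical ratios themselves did not survive in the table as reproduced here, but the means, standard deviations, and group sizes allow a rough check. The sketch below treats the two sets of scores as uncorrelated, which is an assumption (the paper does not give its formula, and the function name is hypothetical). Under that assumption the ratios fall on the reported sides of the conventional cut-offs: above 2.58 (the .01 level) for Group A, and between 2.33 (the .02 level) and 2.58 for Group B.

```python
from math import sqrt


def critical_ratio(mean_1, sd_1, mean_2, sd_2, n):
    """Mean difference over its standard error, treating the two sets of
    scores as uncorrelated (an assumption; the source gives no formula)."""
    return abs(mean_1 - mean_2) / sqrt((sd_1 ** 2 + sd_2 ** 2) / n)


cr_a = critical_ratio(25.6, 8.2, 22.3, 8.0, 112)  # Group A
cr_b = critical_ratio(25.5, 8.1, 28.0, 7.0, 101)  # Group B
print(round(cr_a, 2), round(cr_b, 2))  # 3.05 2.35
```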


The product moment correlation between Form I and Form II in Group A is
.79, and in Group B is .80. Such correlations are considered fairly high and suggest
that the alternate form may be safely substituted for the original form whenever the
need arises. It is evident also that there is approximately the same degree of
variability between the two tests and the order of presentation of the two forms does not
significantly alter the test score. There is only a slight improvement in test results
on a subsequent administration. This improvement may be due in part to lack of
temporal separation of the administrations. It is just as likely, though, that less
tangible factors associated with the learning processes also contribute to make for
increased performance on the second administration. Our general experience in
clinical practice in testing individuals at intervals which may vary from one day to
periods of three months or more is that the amount of improvement shown is not
significant, and that test results may be accepted as reflecting the patient’s
performance at the time of examination.


SUMMARY AND CONCLUSIONS 


An alternate form of the Shipley-Hartford Abstraction Scale has been developed,
and a preliminary report is given on data obtained from two groups of 112 and 101
Army male recruits. Both groups were comparable for intelligence and other factors.
The correlation between alternate and original forms in Group A was .79 and in
Group B .80. The mean difference between forms was significant at the .01 level of
confidence for Group A and at the .02 level of confidence for Group B. Only slight
practice effect was experienced on subsequent testing.


A MODIFICATION OF THE INDEX OF PROFILE SIMILARITY
J. M. DU TOIT 


National Bureau of Educational and Social Research 
Pretoria, South Africa 


One of the quickest and easiest, though probably not the most accurate or
valid, methods of comparing profiles of scores on subtests is that devised by Du
Mas. While Du Mas takes elevation and scatter also into account, the
present note concerns only the similarity in the shapes of profiles, and the
quantitative index of profile similarity, rps, suggested by him.

In the main Du Mas’ method amounts to the comparison of the direction of
slope of the segments of the two profile graphs, and the index is defined as

$$r_{ps} = 2\left(\frac{S}{T} - .5\right)$$

where S is the number of segments with similar slope, and T the total number of
segments in each graph. When any segment of either profile is horizontal, a chance
decision is to be made to determine whether to regard the slope as upward or
downward.
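
The definition above can be sketched in a few lines. The function names and the two ten-score profiles are hypothetical (they are not Du Mas' data); the profiles were chosen so that 5 of the 9 adjacent segments agree in direction, which gives the same value of .111 that is discussed in the next paragraph.

```python
import random


def slope(a, b, rng):
    """Return +1 for an upward segment and -1 for a downward one; a horizontal
    segment is assigned a direction by chance, as the text prescribes."""
    if b > a:
        return 1
    if b < a:
        return -1
    return rng.choice((1, -1))


def r_ps_adjacent(profile_1, profile_2, rng=None):
    """Index of profile similarity over adjacent segments: 2 * (S / T - .5)."""
    rng = rng or random.Random()
    t = len(profile_1) - 1  # T: number of segments in each graph
    s = sum(                # S: segments whose slopes agree in direction
        slope(profile_1[i], profile_1[i + 1], rng)
        == slope(profile_2[i], profile_2[i + 1], rng)
        for i in range(t)
    )
    return 2 * (s / t - 0.5)


# Hypothetical ten-score profiles (no tied scores): 5 of 9 segments agree.
a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
b = [1, 3, 5, 7, 9, 10, 8, 6, 4, 2]
print(round(r_ps_adjacent(a, b), 3))  # 0.111
```

The index runs from -1 (no segment agrees) through 0 (half agree) to +1 (all agree).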

It is clear that the value of rps obtained depends to a large extent on the chance
juxtaposition of subtests, and since there is no unique order of arranging such
subtests, there is no unique value of rps that can be found in this way. To illustrate we
may consider the profiles recently used by Du Mas by way of example. In the
case as shown in Du Mas’ fig. 1, the value of rps amounts to .111. If the order of
the variables had, however, happened to be Ti, Ts, Tz, Ts, Ts, Ts, Ts, Ts, Tz, Tio, a
value of .554 would have been obtained for rps. Similarly almost any other
arrangement would yield a different value for rps. This is merely what is to be expected, as
a sample of only 9 out of the 45 relationships actually existing between the 10
variables is being taken into consideration in each case.


Clearly an index whose value depends to such an extent on an arbitrarily de- 
termined order of recording of the several variables cannot be considered very valid. 
If this index is to be at all useful, even as a rough measure for clinical use only, it is 
evidently imperative that stability should be attained and the value arrived at 
should be a function of the test results and the constant relationships between all of 
them and not of method or order of recording. Another disadvantage of Du Mas’ 
method is the high values required for significance. In few cases of profile comparison
do we deal with more than ten subscores, which imply nine segments only. In such
a case the value for significance at the 5 percent level, according to Du Mas’ table,
is .78.


In order to obtain a stable value, independent of the order of recording, and
also to lower the significance limits by increasing the number of segments concerned,
the following modified procedure is suggested: Instead of considering the slopes of
segments between adjacent scores only, the gradients between every score and
every other score of each profile should be utilized. In the example referred to above,
the directions of slope would be as presented in abbreviated form in Table 1.
The gradients between all possible pairs of scores for the two profiles having been
determined, they are compared and the rps calculated as before. The value so
obtained is in no way dependent on chance juxtaposition of scores and therefore, as
far as this factor is concerned, completely stable. The number of segments considered
is also increased from 9 to 45, for which the values for significance at the 5%
and 1% levels are .295 and .385 respectively.
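
The modified procedure can be sketched as follows. The function name and the two profiles are hypothetical illustrations, not the data of the example referred to above. The sketch also reorders both profiles in the same arbitrary way to show that the all-pairs index does not change with the order of recording.

```python
from itertools import combinations
import random


def r_ps_all_pairs(profile_1, profile_2, rng=None):
    """Index of profile similarity over every pair of scores: 2 * (S / T - .5)."""
    rng = rng or random.Random()

    def direction(first, second):
        if second > first:
            return 1
        if second < first:
            return -1
        return rng.choice((1, -1))  # zero gradient: settled by chance

    pairs = list(combinations(range(len(profile_1)), 2))  # 45 pairs for 10 scores
    similar = sum(
        direction(profile_1[i], profile_1[j]) == direction(profile_2[i], profile_2[j])
        for i, j in pairs
    )
    return 2 * (similar / len(pairs) - 0.5)


# Hypothetical ten-score profiles with no tied scores.
a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
b = [1, 3, 5, 7, 9, 10, 8, 6, 4, 2]

# The same arbitrary reordering of the subtests applied to both profiles.
order = [9, 0, 8, 1, 7, 2, 6, 3, 5, 4]
a_shuffled = [a[k] for k in order]
b_shuffled = [b[k] for k in order]

print(round(r_ps_all_pairs(a, b), 3), round(r_ps_all_pairs(a_shuffled, b_shuffled), 3))  # 0.111 0.111
```

Reordering leaves the value unchanged because each pair of measures is compared exactly once whatever the order, and reversing a pair reverses its gradient in both profiles alike.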

It might be objected that the brevity of the method is forfeited by this somewhat
more elaborate treatment. It is, however, contended here that the five minutes
or so of extra time and effort required are more than justified by the advantages to
be gained in this way. The method still remains very quick and the calculations
very simple.


TABLE 1. DIRECTION OF GRADIENTS BETWEEN ALL MEASURES (ABBREVIATED)

                                   Direction of slope         Comparison
Number    Measures concerned    Profile I     Profile II      of slopes

 ...      T1 and T2             positive      negative        different
 ...      T1 and T3             positive      positive        similar
 ...      T1 and T4             positive      positive        similar
 ...      T1 and T5             positive      negative        different
 ...      T1 and T10            positive      positive        similar
 ...      T2 and T3             negative      positive        different
 ...      T2 and T4             zero          positive        to be determined by chance
 ...      T2 and T10            negative      positive        different
 ...      T3 and T4             positive      positive        similar
 ...      T3 and T5             positive      negative        different
 ...      T9 and T10            negative      positive        different


As a functional test the method here suggested was tried out in the following
case:
The profiles formed by the average scores obtained by four groups on an adjustment
inventory yielding ten subscores were compared by means of rank correlation
coefficients as well as of indices of profile similarity, calculated as described above.¹
The original method had yielded no significant index so that further treatment was
not justified on that basis. The “modified” indices of profile similarity and
coefficients of rank correlation are presented in Table 2.


TABLE 2. MEASURES OF AGREEMENT OF PROFILE PATTERNS

Groups          Indices of profile      Coefficients of rank
                    similarity              correlation

I and II               ...                     .288
I and III              ...                    -.349
I and IV               ...                    -.773
II and III             ...                    -.689
II and IV              ...                    -.111
III and IV             ...                     .156

















These values were then factor-analyzed by Burt’s method and the results are given 
in Table 3. 


TABLE 3. FACTOR SATURATIONS FOR ADJUSTMENT INVENTORY 

                                         Analysis of rank correlation 
Group      Analysis of rps               coefficients 

           Factor I    Factor II         Factor I    Factor II 
I           .8827       .4469             .7958       .4189 
II          .7963      -.5502             .6439      -.5259 
III        -.6932       .5290            -.6820       .4785 
IV         -.7312      -.6548            -.6554      -.5920 
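
One rough check on a two-factor solution is that the sum of products of two groups' saturations should approximately reproduce the coefficient between them. The sketch below applies that check to the saturations from the analysis of rank correlation coefficients and the six coefficients of Table 2. The variable names are illustrative, and the Group III, Factor II saturation is entered as +.4785, an assumed reading of the printed value.

```python
from itertools import combinations

# Saturations from the rank-correlation analysis: (Factor I, Factor II).
saturations = {
    "I": (0.7958, 0.4189),
    "II": (0.6439, -0.5259),
    "III": (-0.6820, 0.4785),   # Factor II sign assumed positive
    "IV": (-0.6554, -0.5920),
}

# Coefficients as tabled for each pair of groups.
tabled = {
    ("I", "II"): 0.288, ("I", "III"): -0.349, ("I", "IV"): -0.773,
    ("II", "III"): -0.689, ("II", "IV"): -0.111, ("III", "IV"): 0.156,
}


def reproduced(a, b):
    """Sum of products of the two groups' factor saturations."""
    return sum(x * y for x, y in zip(saturations[a], saturations[b]))


residuals = {}
for a, b in combinations(saturations, 2):
    residuals[(a, b)] = tabled[(a, b)] - reproduced(a, b)
    print(f"{a:>3} and {b:<3}  tabled {tabled[(a, b)]:+.3f}  "
          f"reproduced {reproduced(a, b):+.3f}")
```

With these figures every reproduced value lies within .01 of the tabled coefficient, which suggests that the two factors account for nearly all of the agreement between groups.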























¹ The nature of the inventory and relevant data for the groups used are not described here as 
the full research will be reported in a future publication. The results are here used for illustrative pur- 
poses only. 

The agreement between the results obtained by the two methods is evident, but 
the rps has the advantages of being more easily calculable and having lower sig- 
nificance limits. 

Perhaps the crucial test of the validity of our “modified” index of profile simi- 
larity is that of the actual significance of results obtained by its use, as distinct from 
statistical significance. In the present case Groups I and II consisted of boys, III 
and IV of girls; I and III belonged to one cultural and language group, II and IV 
to another, differing in fundamental respects from the former. It is very clear that 
Factor I reflects sexual differences, while Factor II concerns cultural differences. 

It may be concluded that, at least in this instance, results yielded by an analysis 
of indices of profile similarity obtained by the method here suggested, appeared to 
be in full accordance with actual, significant differences between the experimental 
groups. In this case the new indices were therefore found to be of sufficient signifi- 
cance and validity to lend themselves to further analysis, yielding valid results in 
agreement with known factual distinctions. 


SUMMARY 


This paper points out certain limitations of Du Mas’ index of profile similarity 
and suggests a modification for the derivation of a unique and stable index, inde- 
pendent of the order in which the experimental measures forming the profile happen 
to be arranged. An example of the application of this modified index in the case of 
an adjustment inventory was described. Analysis of results indicated that these 
were in accordance with rational considerations, thus rendering functional evidence 
for the validity of the procedure. 


PROBLEM 


Experience with the Minnesota Multiphasic Personality Inventory (MMPI) 
indicates the occasional need for a short form of this questionnaire when testing time 
is limited. Previous studies“: *: *. 5) have sought to remove the unscored, or “sleeper” 
items, regardless of item position within the test. The results of this research have 
shown generally poor reliability between the standard and short forms. In addition, 
most MMPI users agree that the number of items should not be reduced because of 
the detriment to future research with this instrument, and many items not scored 
on the original scales are now scored for scales recently developed. It is recognized 
that wholesale application of a short form would be imprudent; however, a more valid 
abbreviated form should be available when the need for such is encountered. 

Perusal of the scoring keys for the group form reveals that only 22 items are 
scored beyond item #420. Two of these are scored for K and 20 are scored for Si. It 
is apparent that if the test is stopped at item #420, very little information can be 
lost which contributes to the original nine clinical scales and the four validity scales, 
although cutting the test at this point results in saving 26% of the testing time. 
This is accomplished without disturbing the original item arrangement with the only 
variation being in the difference in magnitude of the task the subject need complete. 
The problem considered here is the validity of proration or other methods of extra- 
polation of the scores obtained on the first 420 items for K and Si. 


METHOD 


Since item arrangement apparently does not contribute to low reliability be- 
tween the long and short forms, the validity of extrapolation for the Si and K scales 
can be assessed without the actual administration of the short form, on the 
assumption that no difference in responses to items #1 to #420 will occur as the re- 
sult of elimination of the subsequent 146 items. 

From the files of the Hastings State Hospital were drawn the first fifty group 
form MMPI profiles alphabetically encountered. From the psychological records of 
employees from the same hospital, 21 profiles of psychiatric aides were extracted. 
The profiles of 14 business executives were added to the aides’ group to comprise a 
total ‘“‘normal’’ group of 35 individuals. From each of these 85 profiles were obtained 
the scores for Si and K. Each profile was also scored for the number of significant 
items beyond #420 which contributed to the total scores on K and Si. The latter 
score was subtracted from the total score and the result considered to be the score 
which would have been obtained on these scales on the short form. A table of pro- 
ration was computed for Si, from which was derived a predicted score for the 20 
items eliminated on the short form. This predicted score was compared with the 
actual score obtained on these 20 items, and the difference between predicted and 
obtained scores was noted. The table for proration of the Si score is shown in Table 1. 
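
A linear proration of this kind can be sketched as follows. The sketch assumes that the full Si scale has 70 items, so that 50 remain on the short form; the 70-item total is not stated in the text and is an assumption here, as are the function names. The predicted score on the 20 omitted items is taken to be the short-form score scaled by 20/50 and rounded, which may differ from the way the original table was built.

```python
SI_ITEMS_TOTAL = 70      # assumption: total Si items on the full form
SI_ITEMS_OMITTED = 20    # Si items scored beyond item #420 (from the text)
SI_ITEMS_RETAINED = SI_ITEMS_TOTAL - SI_ITEMS_OMITTED


def si_points_to_add(short_form_score):
    """Predicted raw score on the omitted Si items, by linear proration."""
    if not 0 <= short_form_score <= SI_ITEMS_RETAINED:
        raise ValueError("short-form Si score out of range")
    # Integer arithmetic for round-half-up of score * omitted / retained.
    numerator = 2 * short_form_score * SI_ITEMS_OMITTED + SI_ITEMS_RETAINED
    return numerator // (2 * SI_ITEMS_RETAINED)


def prorated_si(short_form_score):
    """Estimated full-form Si raw score."""
    return short_form_score + si_points_to_add(short_form_score)


# Proration table: short-form score -> points to add.
proration_table = {
    score: si_points_to_add(score) for score in range(SI_ITEMS_RETAINED + 1)
}

for score in (10, 25, 40):
    print(score, proration_table[score], prorated_si(score))
```

The difference between `prorated_si(score)` and the score actually obtained on the full form is the quantity whose distribution is examined in the results.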


TABLE 1. PRORATION FOR THE CORRECTION OF Si RAW SCORES 

Short form score    Add    Short form score 

Proration of the 28 scored K items did not promise to be the most effective 
method of extrapolation for this scale. Thus, various points were selected at which 
one point would be added to the raw K score obtained on the short form. The most 
promising K correction appeared to be the addition of one point to the raw K score 
when that score equaled or exceeded 12 on the short form. This is the correction 
that was submitted to cross-validation on a group of 72 individuals, 35 of whom 
were hospitalized and 37 of whom were non-hospitalized. 
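
The K correction described above amounts to a one-line rule. The sketch below states it as code; the function names are illustrative and not part of the original procedure.

```python
K_CUTOFF = 12          # short-form raw K at or above which one point is added
K_ITEMS_RETAINED = 28  # K items scored within the first 420 items


def corrected_k(short_form_k):
    """Estimated full-form raw K from the short-form raw K."""
    if not 0 <= short_form_k <= K_ITEMS_RETAINED:
        raise ValueError("short-form K score out of range")
    return short_form_k + (1 if short_form_k >= K_CUTOFF else 0)


def within_one_point(short_form_k, actual_k):
    """True when the corrected score is within one raw point of the actual."""
    return abs(corrected_k(short_form_k) - actual_k) <= 1


print(corrected_k(11), corrected_k(12), corrected_k(20))
```

Since only two K items are omitted, the full-form score can exceed the short-form score by at most two points, so the rule can miss by more than one point only when a subject scoring below 12 answers both omitted items in the scored direction.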


RESULTS 


The mean K scale T scores for the hospitalized and non-hospitalized groups 
were 56 and 59, respectively. The mean T scores for Si were 56 and 47. There were 
no significant differences between scores obtained for the aides’ group and the 
executives’ group, and they are, therefore, considered to be homogeneous with 
respect to their behavior on these two scales. The predicted scores were compared with 
the actual scores and the results tabulated separately for the hospitalized and the 
non-hospitalized groups. 

Proration of K tended to underestimate the actual K score, particularly in 
the hospitalized group. Cross-validation of the experimentally determined K cor- 
rection, i.e., the addition of one point when the raw K score equaled or exceeded 
12 on the short form, resulted in prediction of the K score within one point for 94% 
of the hospitalized and 100% of the non-hospitalized groups. For the total cross- 
validation sample of 72 persons, the predicted score was within one raw score point 
of the actual score in 97% of the cases while 48% of the entire group showed no 
difference between predicted and actual score on the K scale. 

Proration of the Si score resulted in 95% of the predicted scores for the entire 
sample of 85 persons falling within four raw score points of the actual score. A slight 
tendency for proration to over-correct for the hospitalized group was noted. The 
results found in this group held sufficient promise to warrant extension to the sample 
of 72 persons on which the K correction was cross-validated. The results of this ex- 
tension corresponded closely to those found in the original sample; i.e., 94% of the 
hospitalized group had a predicted score within five raw score units of the actual 
score, while 100% of the non-hospitalized group fell within the same range. When 
the data for this group were combined with those of the original group of 85 sub- 
jects, 97% of the total predicted scores were within five points of the actual scores. 
The distribution of differences of actual and predicted Si scores for the entire non- 
hospitalized group, N = 72, has a mean of -.17 and a standard deviation of 2.35. 
Similar figures for the entire hospitalized sample, N = 85, are +.36 and 2.82. The 
mean for the total group of 157 individuals is +.12 and the standard deviation is 
2.62. Computation of the formula for skewness produces a result of +.074, most of 
which is attributable to the hospitalized group. 
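
The summary figures reported in this section (proportion of cases within a tolerance, mean, standard deviation and skewness of the differences) can be computed as in the sketch below. The scores are hypothetical and serve only as an illustration. The sign convention (actual minus predicted) and the moment coefficient of skewness are assumptions, since the text does not specify either.

```python
from statistics import mean, pstdev


def difference_summary(actual, predicted, tolerance):
    """Summarize actual-minus-predicted differences for one scale."""
    diffs = [a - p for a, p in zip(actual, predicted)]
    m = mean(diffs)
    sd = pstdev(diffs)
    within = sum(abs(d) <= tolerance for d in diffs) / len(diffs)
    # Moment coefficient of skewness (assumed formula).
    third_moment = sum((d - m) ** 3 for d in diffs) / len(diffs)
    skew = third_moment / sd ** 3 if sd else 0.0
    return {"mean": m, "sd": sd, "within": within, "skewness": skew}


# Hypothetical raw scores, for illustration only.
actual_si = [30, 25, 41, 18, 36, 22]
predicted_si = [28, 26, 41, 21, 30, 22]

summary = difference_summary(actual_si, predicted_si, tolerance=5)
print(summary)
```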


SUMMARY AND CONCLUSIONS 


An abbreviated form of the group MMPI, using only the first 420 items, is sug- 
gested for use when the need for such an instrument arises. The suggested short 
form eliminates only two K items and 20 Si items, although it results in a saving of 
26% of testing time. A correction for the K scale was found to be accurate within 
one raw score point in 97% of the cross-validation group. The proposed correction 
for the Si scale predicted the actual score within five points in 97% of the entire 
sample of 157 hospitalized and non-hospitalized persons. The results indicate rather 
clearly that very little change in absolute score, or in the configuration of the profile 
as a whole, will be produced by the use of this procedure. It is not improbable that 
the differences between predicted and actual scores shown here are only slightly 
larger than those which would be found on a test-retest study. In view of the re- 
sults of earlier studies, it is suggested that this is the most reliable and valid abbre- 
viated form of the group MMPI at the present time. 


SCHIZOPHRENIC IDEATION AS STRIVING TOWARD THE SOLUTION 
OF CONFLICT 


ANTON BOISEN                    RICHARD L. JENKINS 
Elgin State Hospital            Veterans Administration Central Office 

MAURICE LORR 
Veterans Administration Veterans Benefits Office 


PROBLEM AND METHOD 


If, as seems probable, schizophrenia may best be understood as a functional 
breakdown of the adaptive process related to intolerable inner conflict, then a careful 
study of the ideas presented by acute schizophrenics might be expected to shed some 
light on the nature of those conflicts, and on the means by which the patient deals 
with them. The publication by the senior author (1) of a table of the relations be- 
tween ideas expressed by 47 acute schizophrenics with discussion of two clusters of 
related ideas apparent in the table led to the present factor analysis. 

This article presents a factorial study of the ideation of a series of 78 male 
psychotic patients of non-organic type, most of them schizophrenic, who were re- 
ceived at Elgin State Hospital during the summer of 1949. Thirty-one of these 
patients were classified as showing acceptance of defeat, through self-deception, 
escape into phantasy or escape through alcohol or drugs, and these subjects were 
not included in this study. 

Forty-seven were classified as cases showing acute anxiety and the following 
analysis is based on only these 47 cases. Tetrachoric correlation coefficients were 
used, and it was necessary in a number of instances to approximate the value of the 
coefficient when one of the corners of the fourfold table had an entry of zero. How- 
ever, the consistency of the results leads us to believe that the analysis has consider- 
able validity. 
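
One common way of approximating a tetrachoric coefficient from a fourfold table, including tables with an empty corner, is sketched below. It uses the cosine-pi approximation and adds 0.5 to every cell when any cell is zero. Both choices are assumptions made for illustration; the text does not say which approximation the authors used, and the example tables are hypothetical.

```python
import math


def tetrachoric_cos_pi(a, b, c, d):
    """Cosine-pi approximation to the tetrachoric correlation.

    Cells of the fourfold table: a = both ideas present, b and c = only
    one present, d = both absent. If any cell is zero, 0.5 is added to
    every cell (an assumed continuity correction).
    """
    if min(a, b, c, d) == 0:
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    cross_ratio = (a * d) / (b * c)
    return math.cos(math.pi / (1 + math.sqrt(cross_ratio)))


# Hypothetical fourfold tables for 47 cases.
print(round(tetrachoric_cos_pi(10, 5, 6, 26), 2))
print(round(tetrachoric_cos_pi(8, 0, 9, 30), 2))   # one empty corner
```

Without some such adjustment an empty corner forces the estimate to exactly +1 or -1, which is why an approximation is needed in those instances.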


RESULTS 
A centroid analysis reveals two factors. These were rotated to the oblique solu- 
tion presented in Table 1. Entries with heavy loadings on the first factor and neg- 
ligible loadings on the second factor are mystical identification, ideas of death by 
divine decree, sense of mission and expectation of world disaster. This factor we will 
call seeking solution through religious surrender. 


TABLE 1. CENTROID MATRIX F AND OBLIQUE MATRIX V 

Nature of Ideation 

Mystical identification 
Ideas of death by divine decree 
Sense of mission 
Ideas of world disaster 
Ideas of reincarnation 
Ideas of change of sex 
Obsessive sexuality 
Accentuated religious interest 
Ideas of death at the hands of enemies 
Externalized conscience                   -05 
Transfer of blame                         -16 
Ideas of suicide                          -55    49 








Columns I and II represent the loadings in the orthogonal centroid factors. h² is the common 
factor variance. 

Columns A and B represent the oblique factors obtained by rotation. The cosine of the angle be- 
tween the normals A and B is .54. This implies an angle of 144° between the planes and a substantial 
negative correlation between them. 

The second factor is clearly one of paranoid projection. The heaviest loadings 
appear in ideas of death at the hands of enemies, and in what the senior author has 
described as externalized conscience. This latter entry might broadly be paraphrased 
as derogatory ideas of reference or of influence. The third highest loading appears 
in transfer of blame. Under this heading were grouped ideas of electrical currents 
shooting through the body usually under the control of some enemy and playing 
upon the genitals, hypnotic or other personal or supra-personal influences and cir- 
cumvention of one’s plans through individuals or organized groups, usually secret 
societies. We will call this factor seeking solution through paranoid projection. 

It is noteworthy that the lines of these two factors meet at an obtuse angle 
(144°) indicating a substantial negative correlation between them. The individuals 
who seek one solution tend selectively to be individuals who do not turn toward the 
other solution. 

There are three entries with positive loadings on the religious factor and smaller 
but also positive loadings on the paranoid factor. They are ideas of reincarnation, 
ideas of change of sex, and obsessive sexuality. 

Accentuated religious interest has, as we might expect, a strong positive asso- 
ciation with seeking solution through religious surrender and a moderately strong 
negative association with seeking solution through paranoid projection. 

It is of particular interest that the twelve individuals who attempted suicide 
tended to be those not progressing toward a solution of their conflicts by either relig- 
ious surrender or paranoid projection, for the idea of death by suicide has substantial 
negative loadings on both these factors. 


DISCUSSION 


This study would indicate tendencies among these 47 schizophrenic patients 
with acute anxiety to progress toward one or the other of two solutions for their 
intolerable conflict. 


One pathway is that of seeking solution through religious surrender. Those who 
appear to be moving in this direction have a favorable prognosis. Only one out of 
eight patients reported as showing a reaction pattern characterized by mystical identi- 
fication was unimproved after two years (1, Table 1). Only one of ten patients showing 
ideas of death by divine decree was unimproved (1, Table 5). Only six, or 21%, of 28 
cases showing accentuated religious interest were unimproved, while 28, or 56%, of 
50 cases showing unaccentuated religious interest were unimproved (1, Table 3). 

The other solution is that of paranoid projection, ascription of evil intent to 
others and transfer of blame. Those who appear to be moving in this direction have 
an unfavorable prognosis. Eight of 13 patients expressing ideas of death at the hands 
of enemies were unimproved after two years (1, Table 5), as were five out of nine pa- 
tients showing externalized conscience (1, Table 1). 

It is of interest that those patients who do not achieve any relief of conflict 
either through religious surrender or through paranoid projection are the ones most 
prone to attempt suicide. Yet these patients show only 3 out of 12 unimproved 
(1, Table 5). This suggests a prognosis which, while less favorable than that for those 
achieving some relief of conflict through a religious surrender, is more favorable 
than that for those achieving some relief of conflict through paranoid projection. 
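
The outcome counts quoted in this discussion can be put on a common footing by converting each to a percentage unimproved, as in the sketch below. The counts are those given in the text; the dictionary labels are shortened for convenience.

```python
# (unimproved, total) after two years, as quoted in the text.
outcomes = {
    "mystical identification": (1, 8),
    "death by divine decree": (1, 10),
    "accentuated religious interest": (6, 28),
    "unaccentuated religious interest": (28, 50),
    "death at the hands of enemies": (8, 13),
    "externalized conscience": (5, 9),
    "attempted suicide": (3, 12),
}

percent_unimproved = {
    idea: 100 * unimproved / total
    for idea, (unimproved, total) in outcomes.items()
}

# List the entries from most to least favorable outcome.
for idea, pct in sorted(percent_unimproved.items(), key=lambda kv: kv[1]):
    print(f"{pct:5.1f}%  {idea}")
```

On these figures the suicide group, at 25% unimproved, falls between the religious entries (10 to about 21%) and the paranoid entries (about 56 to 62%), which is the ordering described above.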

An examination of a similar table prepared on the entire 78 cases, including the 
31 previously classified as showing acceptance of defeat made it evident that the 
result of a factor analysis of this table would be in no way essentially different from 
that here presented. In this undertaking a differentiation was made between two 
elements which have here been grouped under obsessive sexuality. The patients 
who admitted a sexual conflict or problem without surrender to overt and gross 
sexual behavior or the development of grossly distorted and delusional sexual ideas 
were classified as showing an acknowledged sexual problem. Those showing gross 
overt behavior or sexual delusions were classified as having an overt sexual problem. 
Acknowledged sexual problems were definitely associated with the seeking of a re- 
ligious solution. The overt sexual problems were not so associated. 

It is of interest that the two factors revealed may be related to two solutions for 
inner conflict both of which are represented in the religious tradition. One may be 
described as surrender to the will of God and the other as blaming the devil or some 
devil-surrogate. Historically, the latter type of solution has been responsible for 
witch hunts. The present study provides indication that individually as well as 
socially the projection of responsibility proves a method of dealing with conflict 
which reduces tension at the expense of stabilizing morbid and distorted thinking. 


SUMMARY 


Factor analysis of ideas expressed by 47 schizophrenics in acute anxiety reveals 
two factors. One may be described as a factor of seeking solution through religious 
surrender and is associated with a good prognosis for improvement. The other may 
be described as a factor of seeking solution through paranoid projection and is asso- 
ciated with a bad prognosis. 


A NOTE ON RIGIDITY AND LENGTH OF INSTITUTIONALIZATION 
PAUL SOLOMON 
Myles Standish State School, Taunton, Mass. 


Brand, Benoit and Ornstein“? have recently reinvestigated Kounin’s finding 
that “rigidity is a positive monotonous function of CA”. »-*5), Their findings were 
in substantial agreement with those of Kounin, but they raised the question of 
length of institutionalization as a contributing factor to the rigidity. They point 
out that, for mental defectives, the correlation between CA and length of institution- 
alization is high since they are placed in institutions relatively early in life and in- 
crease in age is accompanied by increased length of institutionalization. Since rigid- 
ity studies are usually carried out with institutionalized defectives, the problem 
arises as to whether the age factor or the amount of time spent in institutions is of 
importance in the rigid behavior of the defectives. 

In this study an attempt was made to get at the crux of this problem. Myles 
Standish State School has recently been opened and the admissions come from two 
main sources. Most are transfers from older overcrowded institutions; some are ad- 
missions of defective adults directly from the community. Thus one group has been 
institutionalized for a long time, whereas the others have grown up in family situa- 
tions. By a comparison of these two groups it is possible to see what is the influence 
of length of institutionalization upon the rigidity measure. If the groups differ 
greatly on rigidity measures it is likely that time in institutions, with consequent 
lack of environmental stimulation, is an important factor making for rigidity. 


METHOD 


Two groups of 12 individuals each were subjects in this study. One group, 
which we shall call the newly institutionalized (NI), was made up of persons who 
had been admitted from the community less than three weeks prior to the time of 
testing. This group included 7 females and 5 males. The other group, long institu- 
tionalized (LI), was made up of individuals who had been in state schools for at 
least five years. The members of the 2 groups were matched on the basis of sex, 
CA, IQ and diagnosis. The major difference between individuals in the two groups 
was in length of institutionalization. The make up of the groups is described in 
Table 1. 


TABLE 1. RANGE AND MEANS OF CA, IQ AND LENGTH OF INSTITUTIONALIZATION 
FOR TWO DEFECTIVE GROUPS 

                                         Groups 
Factors                                  LI             NI 

CA Range                                 15-34 years    15-34 years 
CA Mean                                  21.9 years     21.7 years 
IQ Range                                 41-60          41-60 
IQ Mean                                  54             54 
Length of Institutionalization 
    Range                                5-14 years     4-21 days 
    Mean                                 8 years        11 days 








The measure of rigidity was that used by Brand“, which is the Rhythm Tap- 
ping Test devised by Luchins“). The subject is requested to tap with his finger in 
time to a metronome. When he is following the pattern closely, a different rate is 
substituted and he is timed to see how long it takes him to adjust to the new pattern. 
The measure of rigidity is the amount of time taken to adjust. 

The results, in terms of the rigidity measure, showed that the average time to 
make the adjustment for the LI group was 12.8 seconds, for the NI group 11.9 
seconds. The corresponding standard deviations for the two groups were 8.4 and 
10.3. Using small sample statistical method“, we find that there is no significant 
difference between the two groups. Since the distinguishing characteristic of the 
groups is in the amount of time spent in institutions, it appears that this is not a 
major contributing factor to rigidity of behavior. 
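
The comparison of the two group means can be reproduced approximately from the summary figures alone. The sketch below computes Student's t for two independent groups with a pooled variance. The text does not say which small sample method was used, so this choice is an assumption. It also treats the reported standard deviations as estimates with n - 1 in the denominator and ignores the matching of subjects, since paired data are not given.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Student's t and degrees of freedom from group summary statistics."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, n1 + n2 - 2


# LI group: mean 12.8 sec, SD 8.4; NI group: mean 11.9 sec, SD 10.3.
t, df = pooled_t(12.8, 8.4, 12, 11.9, 10.3, 12)

T_CRIT_05 = 2.074  # two-tailed 5% point of t for 22 degrees of freedom
print(f"t = {t:.2f}, df = {df}, significant at 5%: {abs(t) > T_CRIT_05}")
```

Under these assumptions t is roughly .23, far below the value required for significance, in line with the conclusion stated above.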

In this investigation, our main interest was not the problem of rigidity as in- 
fluenced by age. Werner“ felt that Kounin was incorrect in saying that rigidity 
increased with age. Werner’s view is that stability increases with age and a con- 
fusion of rigidity and stability makes for the opposing conclusions of these two in- 
vestigators. Our groups have an age spread of 19 years, but individuals in each 
group are matched for age. The correlation of age and rigidity for our LI group is .62, 
for our NI group is .57. The slight difference in the correlations is not statistically 
significant. This is corroborative evidence that length of institutionalization is not a 
critical factor in rigidity. 
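
The difference between the two correlations can be examined with Fisher's z transformation, as sketched below. This particular test is an assumption, since the text does not name the method used, and it treats the two groups as independent samples of 12 each.

```python
import math


def fisher_z_difference(r1, n1, r2, n2):
    """z statistic for the difference between two independent correlations."""
    z1, z2 = math.atanh(r1), math.atanh(r2)
    se = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    return (z1 - z2) / se


z = fisher_z_difference(0.62, 12, 0.57, 12)
print(f"z = {z:.2f}, significant at 5%: {abs(z) > 1.96}")
```

With samples of this size the standard error of the difference is close to .47 in z units, so correlations of .62 and .57 are statistically indistinguishable.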

It seems that longer time in the community actually means greater opportunity 
for stimulation since the extramural environment provides a much greater variety 
of experience than institutional life does. Combs®? has described how the differentia- 
tions a person can make of his phenomenal field will be dependent upon the oppor- 
tunities for perception to which he has been exposed. He defines intelligence as “a 
function of the factors which limit the scope and clarity of an individual’s pheno- 
menal field’’®: »- 62), On the basis of our results it seems that opportunity for per- 
ception is not as important a factor as utilization of that opportunity. Defectives 
act as they do because they have not been able to integrate and utilize the exper- 
iences to which they have been exposed. 

Our sample is admittedly small and only one measure of rigidity was used. 
It seems reasonable to conclude that length of institutionalization is not an im- 
portant factor in functional rigidity of defectives. However, there is likely to be a 
difference in rigidity between institutionalized defectives and persons of similar age 
and intelligence level who have always been able to get along in the community. 


NOTE ON AN EXPERIMENTAL SCORING SYSTEM FOR THE INSIGHT 
TEST 


KATHERINE K. FASSETT 
Wisconsin School for Boys, Waukesha* 


An experimental scoring system for Sargent’s Insight Test was presented by the 
author several years ago“, wherein the subject’s productions were scored as to inter- 
pretation, approach, solution, and effect. Reliabilities as high as .49 being shown, 
the present study was done in order to investigate validity. 

Two groups of 30 subjects each were given the test in the usual written form. 
The experimental group consisted of hospitalized mental patients, with diagnoses in 
neurotic or psychotic categories; the comparison group, applicants for attendant 
positions in a different mental hospital. All such applicants were being routinely 
screened by the hospital with interviews and some projective testing. Excluding 
from the study any whose screening led to suspicion of serious personality disorder, 
applicants were taken successively as subjects, until a number was obtained to 
equal the patient group. Thus, the comparison group has some criterion of “nor- 
mality” beyond the negative one of no known history of mental disorder. The two 
groups were roughly equal in age, socio-economic status, and sex ratio. 

As in the earlier study, scoring was done by the experimental system, using the 
3-point scale modification; then results were computed into percentages, and the 
differences between groups compared, for each of the scoring categories. The ob- 
tained results show no differences significant at the 5% level or better. Neither was 
there a significant difference between groups as to the number of questions answered. 

Sargent’s publications®: ® have demonstrated that the Insight Test is capable 
of differentiating groups of mental patients from groups presumably “normal.” Two 
possible reasons may be mentioned, as to why this scoring system fails so to do. The 
system may be valid in itself, but not be applicable to this particular test, inasmuch 
as the questions, in a number of instances for both groups, failed to yield adequately 
scorable material. Or the rationale of the scoring system may be at fault, in that 
the thought processes identified may actually fail to differentiate “normal” from 
“abnormal” thinking in problem-solving situations. Whether for these reasons, or 
for others, it must be concluded that this experimental scoring system for the Insight 
Test lacks sufficient validity to be used as a differentiating device. 


EDITORIAL OPINION 





SMALL SAMPLE STATISTICS 


The paper by Hunt, Arnhoff and Cotton on “Reliability, Chance, and 
Fantasy in Inter-Judge Agreement Among Clinicians” in our July 1954 issue de- 
serves the most careful attention from all experimenters using small sample statistics 
because it points up grave inadequacies which are overlooked frequently in uncritical 
research design. Any person with long experience with games of chance will be 
aware of certain vagaries in statistical tendencies which may grossly distort the 
characteristics of small samples of data taken in isolation from the long term trends 
of the factors under study. Every elementary textbook in statistics contains illus- 
trations of the apparent skewings which may occur with sampling inadequacies but 
which are smoothed out with collection of a sufficiently large sampling of data to 
eliminate atypical samplings. 

An interesting phenomenon observed in many clinical fields including medicine 
relates to the fact that initial reports concerning new methods and therapies tend to 
be much more optimistic than subsequent larger scale studies undertaken by the 
profession as a whole. The history of clinical medicine is replete with examples 
where a new drug was introduced with prematurely enthusiastic claims based on 
preliminary laboratory and clinical research, only to be completely discredited by 
the inability of independent workers to replicate the results. This phenomenon has 
occurred in connection with research results reported by thoroughly competent 
workers from the most reputable laboratories and based on the most extensive pre- 
clinical investigations possible. It is interesting to speculate on the deficiencies of 
research design causative of overoptimistic findings and claims. 

Every gambler is familiar with the phenomenon known as “beginner’s luck” 
whereby a novice without any knowledge of statistical probabilities as applied to a 
gambling situation may experience long runs of luck. We recognize that statistical 
probabilities will determine whether any beginner will have greater or less than 
chance success, and do not claim that extended studies would show any tendency 
toward skewing in favor of beginners. However, in clinical fields where various re- 
wards await the discoverers of promising new techniques, there are psychological 
factors operating which tend to magnify the importance attached to atypical runs of 
data which may be interpreted as being clinically significant. As can occur in gamb- 
ling where a player (beginner or otherwise) may have a long run of luck, far exceed- 
ing chance expectancies, it is inevitable that atypical skewed samplings will occur in 
studies utilizing ultra-refined statistical treatment of small sample data with sufficient 
frequency so that unduly favorable results will be obtained often enough to explain 
the over-optimistic reports. Examples of this type have been frequently encountered 
in experiments with extrasensory perception in which a particular subject may per- 
form far above chance expectancy for a limited period, only to have the superior 
results taper off with the collection of large scale data in which normal probabilities 
reassert themselves. This “apparent” loss of extrasensory ability is too often inter- 
preted as a genuine loss of ability rather than as the correction of a statistical 
atypicality, which is its actual cause. With hundreds of workers investigating new clinical 
developments, it is to be expected that atypical sampling in the work of a few will 
result in overoptimistic research reports which will be corrected when the results of 
all workers are tabulated. This tendency will be particularly observed in situations 
where negative results are disregarded or not as systematically tabulated as positive 
results which carry greater prestige. 
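
The argument can be made concrete with an exact binomial calculation. The sketch below uses a hypothetical card-guessing setting, with a chance hit rate of one in five, 25 guesses per study, and "success" declared at 9 or more hits. The figure of 500 workers is likewise chosen only for illustration.

```python
from math import comb


def upper_tail(n_trials, p_hit, min_hits):
    """Exact probability of at least min_hits successes by chance alone."""
    return sum(
        comb(n_trials, k) * p_hit ** k * (1 - p_hit) ** (n_trials - k)
        for k in range(min_hits, n_trials + 1)
    )


# Probability that one small study looks "successful" with no real effect.
p_lucky_run = upper_tail(n_trials=25, p_hit=0.2, min_hits=9)

# Expected number of such studies among many independent workers.
n_workers = 500
expected_lucky_workers = n_workers * p_lucky_run

# The same hit rate (36%) sustained over ten times as many trials.
p_lucky_long_run = upper_tail(n_trials=250, p_hit=0.2, min_hits=90)

print(f"{p_lucky_run:.3f}  {expected_lucky_workers:.1f}  {p_lucky_long_run:.1e}")
```

A result that occurs by chance in a few studies out of every hundred is bound to appear somewhere when hundreds of workers run small studies, while the same hit rate over a long series is essentially impossible by chance. If only the favorable small studies are reported, the literature will overstate the effect.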

Another statistical phenomenon which may result in gross skewing of data arises 
from overpreoccupation with atypical cases. It is possible to conceive of persons 
who are characterized by either “good” or “bad” luck throughout the entire course 
of their lives. Indeed, clinical studies will uncover many persons whose life histories 
show consistent overprivilege or underprivilege, freedom from disease or continuing 
incapacitation, escape from the hazards of war and accident or undue susceptibility. 
The frequency of totally improbable occurrences whereby a person is 
favored or disfavored by the “one chance in a billion” indicates that we must always 
be on guard against the appearance of this phenomenon in research data. Errors of 
this type are particularly apt to occur in research designs using atypical populations 
such as college students, soldiers, criminals or institutional inmates where atypical 
patterns of variables may be operating. Even more important in clinical fields are 
errors related to the specific abilities or disabilities of the judges making the ratings 
or judgments which constitute the data. For example, it is entirely possible that no 
contemporary clinician has any reliable and valid method for diagnosing schizo- 
phrenia, interpreting Rorschach data, or judging the suitability of any method of 
therapy in a particular case. It is also possible that there may exist at the present 
time no universal agreement in the field as a whole, or between any two clinics or 
research groups, concerning what constitutes a “normal” subject, or a psychoneurotic, 
or an “organic” profile of objective test results, etc. Under such circumstances, 
where there are a large number of completely uncontrolled and even undifferentiated 
variables, is it any wonder that such a universal lack of agreement 
exists in relation to such contemporary methods as psychodiagnosis, interpretation 
of projective test results, evaluation of the results of psychotherapy, and personality 
assessment in general? 

There is a discouraging tendency among researchers to attempt to cover up unsatisfactory 
research design, particularly as regards adequate sampling techniques, by 
use of over-refined small sample statistics. Generally speaking, small sample statis- 
tics may be applied validly only in situations where it may be assumed that the data 
are fairly representative of normal probabilities. Conversely, any evidence of 
skewing or atypicalness of the data must rigidly contraindicate any sweeping inter- 
pretations based on small samples. In fact, it must be assumed that small sample 
data are atypically distributed until normal distribution is demonstrated. A further 
note of caution must be directed against the uncritical interpretation of measures of 
the significance of differences between means. For example, in the analysis of the 
significance of mean intercorrelations between measures of X and Y in groups of 
men and women, it is entirely possible that a statistically significant difference might 
be obtained between mean correlations of .10 and -.05 for men and women respectively; 
but does such a finding have any practical significance when 
viewed in terms of the lowness of the correlations themselves? 
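
The point about correlations of .10 and -.05 can be illustrated numerically. The sketch below is not a procedure given in the text: it assumes two single correlations from independent groups (rather than means of several intercorrelations), compares them with Fisher's r-to-z transformation, and uses hypothetical group sizes of 50 and 500. The function names are likewise illustrative only.

```python
import math


def fisher_z_test(r1, n1, r2, n2):
    """Two-sided z test for the difference between two independent correlations.

    Uses Fisher's r-to-z transformation. n1 and n2 are hypothetical group
    sizes and must each exceed 3.
    """
    z1 = math.atanh(r1)  # Fisher transform of the first correlation
    z2 = math.atanh(r2)  # Fisher transform of the second correlation
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))  # standard error of z1 - z2
    z = (z1 - z2) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided p under the standard normal
    return z, p


def shared_variance(r):
    """Proportion of variance in Y accounted for by X, i.e. r squared."""
    return r * r


if __name__ == "__main__":
    # Group sizes are assumptions chosen only to show the effect of n.
    for n in (50, 500):
        z, p = fisher_z_test(0.10, n, -0.05, n)
        print(f"n = {n} per group: z = {z:.2f}, p = {p:.3f}")
    print(f"variance accounted for: {shared_variance(0.10):.4f} "
          f"and {shared_variance(-0.05):.4f}")
```

Under these assumptions the difference does not reach the .05 level with 50 subjects per group but does with 500 per group, while in either case neither correlation accounts for more than one per cent of the variance. Statistical significance here reflects sample size, not the strength of the relationship.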

We need to be much more critical of the use of small sample statistics both in 
designing research and in the interpretation of data. Many workers apparently use 
statistical treatment of data as a method of conferring scientific respectability on 
their work and without any clear understanding of the technical points involved. 
As long as we have far reaching conclusions being drawn from studies comparing 
the results of groups consisting of from 5 to 100 subjects each, we may be suspicious 
of the results in direct relation to the smallness and lack of adequate sampling of 
the groups. 

F. C. T. 
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The author is Professor of Psychology at the Fordham University Graduate 
School and a recognized authority in the fields of differential psychology and psycho- 
logical testing. This book is intended as an introductory text to the principles of 
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basic theory of testing, Part II consists of 6 chapters on general classification tests, 
H. B. LEWIS, M. HERTZMAN, K. MACHOVER, P. B. MEISSNER and S. WAPNER on 
the topic of the self-consistency and difference in perception of the space orientation 
of each person. Utilizing a variety of experimental and clinical methods, their re- 
sults indicate a differentiation of two more or less opposite perceptual reaction 
types, i.e. field-dependent vs. independent, analytic perceptual performance. Re- 
lated evidence indicates attitudinal and behavioral correlations of these perceptual 
types with other important dimensions of personality such as relation to the environ- 
ment, control over impulse life, and self concept. This is an important contribution 
with which all clinicians will wish to become familiar. 


CHRISTIE, RICHARD AND JAHODA, MARIE (Eds.) Studies in the Scope and Method of 
“The Authoritarian Personality”. Glencoe, Ill.: The Free Press, 1954, pp. 279. 
$4.50 
This volume presents a collection of papers by M. JAHODA, H. H. HYMAN AND 
P. B. SHEATSLEY, RICHARD CHRISTIE, HAROLD D. LASSWELL, AND E. FRENKEL-BRUNSWIK 
evaluating the theoretical and methodological foundations of ADORNO 
et al.’s work on The Authoritarian Personality. Because of its important social 
implications, the concept of the authoritarian personality deserves the most careful 
objective evaluation and replication. The authors discuss some of the theoretical 
and methodological shortcomings of the original research and outline some alternative 
implications. 


WOLBERG, LEWIS R. The Technique of Psychotherapy. New York: Grune and Stratton, 
1954, pp. 869. $14.75 


The author is Director, Postgraduate Center for Psychotherapy, New York 
City, and Clinical Professor of Psychiatry at the New York Medical College. Based 
on the methods and case materials collected over the years at the Center, this book 
presents a soundly organized and clearly written account of the detailed methods 
which the author has found valuable in his experience. The basic orientation is 
eclectic with proper emphasis on the indications and contraindications for special 
techniques. Verbatim transcriptions of interviews provide a wealth of teaching 
material for the student. Many of the finer points of psychotherapy are discussed 
in great detail together with extensive explanations of the technical points at issue. 
This is one of the more important contributions to the field of the past decade. 


HALPERN, FLORENCE. A Clinical Approach to Children’s Rorschachs. New York: 
Grune and Stratton, 1953, pp. 270. $6.00 


Dr. Halpern’s extensive experience with literally thousands of children’s Rorschachs 
is summarized here in a detailed presentation of her system of interpretation. 
In addition to general discussions of test administration, scoring, and the significance 
and interpretation of test factors, 40 illustrative records from all types of 
childhood disorder are presented with detailed discussion of scoring and interpretation. 


HEATH, ROBERT G. (Chairman). Studies in Schizophrenia. Cambridge, Mass.: 
Harvard University Press, 1954, pp. 619. $8.50 


This volume presents a multi-disciplinary approach to mind-brain relationships, 
conducted by the Tulane University Department of Psychiatry and Neurology, 
studying neurohumoral mechanisms in both animals and man in relation to stress. 
Operative techniques were developed with animal and human subjects for implanting 
electrodes for electrically stimulating subcortical areas directly. Detailed physiologic, 
biochemical, neurologic and psychological data were collected relating to 
changes observed following electrical stimulation by open surgical and closed stereotaxic 
methods. 


STONE, C. P. (Ed.) Annual Review of Psychology. Vol. V, 1954. Stanford, California: 
Annual Reviews, Inc., 1954, pp. 448. $7.00 
This fifth volume of the Annual Review series continues the authoritative and 
concise reporting of the most important recent developments in all major fields 
of psychology. The more important research contributions in each field are reported 
in detail with critical evaluations of findings. 


FROSCH, JOHN et al. (Eds.) The Annual Survey of Psychoanalysis. Vol. II. New York: 
International Universities Press, 1954, pp. 724. $10.00 
This second volume presents a compilation and interpretive summary of the 
psychoanalytic literature for 1951. The more important papers for that year are 
presented and discussed in detail. The style is very readable and this series will comprise 
a valuable reference work for students and others who wish to keep abreast of 
the latest developments in psychoanalysis. 


SAPPENFIELD, BERT R. Personality Dynamics. New York: Knopf, 1954, pp. 412 
+ xvi. $5.50 

The author is professor of psychology at Montana State University. This book 
attempts to integrate an organismic concept of behavior with the principal psychodynamic 
mechanisms. The modes of expression of each mechanism are described 
and interpreted in terms of their dynamic interrelations with anxiety. The 
presentation is almost entirely theoretical, with occasional illustrative 
excerpts from the literature. In so far as psychoanalytic mechanisms are able to 
explain personality dynamics, this is a very readable textbook on the psychology of 
adjustment. 


GILL, MERTON; NEWMAN, RICHARD AND REDLICH, FREDRICK C. The Initial Interview 
in Psychiatric Practice. New York: International Universities Press, 1954, 
pp. 423. $6.00. With supplementary album of three phonograph records sold 
at cost of $4.60 each consisting of verbatim transcriptions of three interviews. 
From the Yale University Department of Psychiatry, the authors present 
theoretical discussions of the initial interview in psychiatric practice with verbatim 
transcriptions of three typical initial interviews presented for teaching purposes both 
in the text and on an album of phonograph records. Interpretive 
comments are also presented in the text on alternate pages so that the student may 
understand the dynamics of the interview. The basic orientation of the interviews 
is eclectic. 


Driver, HELEN I. Multiple Counseling. Madison, Wis.: Monona Publications, 1954, 
pp. 280. 

The term “Multiple Counseling” refers to the systematic utilization of individual 
counseling with small group discussion methods in catalysing personal growth. 
The author reports in detail her methods and experiences in multiple counseling 
with illustrative case materials. 


CARMICHAEL, LEONARD (Ed.) Manual of Child Psychology. New York: John Wiley, 
1954, pp. 1295. $12.00 
This is a revised second edition of the authoritative manual published originally 
in 1946. Twenty-two distinguished authorities contribute discussions of the major 
psychological aspects of childhood, all of uniformly high standard and 
style. 


AUSUBEL, DAVID P. Theory and Problems of Adolescent Development. New York: 
Grune and Stratton, 1954, pp. 580. 


JUNG, C. G. The Practice of Psychotherapy. New York: Pantheon Books, 1954, pp. 
377. $4.50 
Collected essays on the psychology of transference and other subjects from the 
works of C. G. Jung. 


BRAATOY, TRYGVE. Fundamentals of Psychoanalytic Technique. New York: John 
Wiley, 1954, pp. 404. $6.00 
A prominent Norwegian psychiatrist appraises the contributions of psychoanalysis 
from the viewpoint of his wisdom and personal experience. 


BARRELL, JOSEPH. A Philosophical Study of the Human Mind. New York: Philosophical 
Library, 1954, pp. 575. $6.00 


MICHAL-SMITH, H. Pediatric Problems in Clinical Practice. New York: Grune & 
Stratton, 1954, pp. 310. $5.50 


BURT, CYRIL. The Causes and Treatment of Backwardness. New York: Philosophical 
Library, 1953, pp. 128. $3.75 


ABRAMSON, HAROLD A. (Ed.) Problems of Consciousness. New York: Josiah Macy, 
Jr. Foundation, 1954, pp. 177. $3.25 
Transactions of the Fourth Conference on Problems of Consciousness sponsored 
by the Josiah Macy, Jr. Foundation. 


Various authors. Learning Theory, Personality Theory and Clinical Research. New 
York: John Wiley, 1954, pp. 164. $3.50 
Reports of eleven papers presented at a symposium held by the University of 
Kentucky on these topics in 1952. 


MOORE, MERRILL. Case Record from a Sonnetorium. New York: Twayne Publishers, 
1951, pp. 50. $1.50. 
Reflections on human life and the world in liberated sonnets, with incidental 
cartoons by Edward Gorey. 


NOTICE 

Announcement has been made of the completion of plans for a new quarterly, 
Archives of Criminal Psychodynamics, to be printed by The Lord Baltimore 
Press, with Ben Karpman, M. D. as Editor and Melitta Schmideberg, M. D. as 
Associate Editor. The rest of the Editorial Board is being formed and at present 
consists of the following: Walter Bromberg, M. D., Jacob H. Conn, M. D., Wladimir 
G. Eliasberg, M. D., Arthur N. Foxe, M. D., George E. Gardner, M. D., Leo Kan- 
ner, M. D., Samuel B. Kutash, Ph.D., Lawson G. Lowrey, M. D., Sydney B. 
Maughs, M. D., Lester W. Sontag, M. D. Foreign correspondents and contributors 
are: Jose Belbe, M. D., Buenos Aires, Argentina; Mme. Marie Bonaparte and Daniel 
Lagache, M. D., Paris, France; Edward Glover, M. D., London, England; Kenji 
Ohtsuki, M. D., Tokyo, Japan. 

The Archives of Criminal Psychodynamics will be psychoanalytically oriented. 
It will devote itself to the encouragement of research into the psychodynamics of 
antisocial and criminal behavior, the interpretation and dissemination of the existing 
knowledge of the same; promotion of superior legal and humane understanding of 
the relations involved between the criminal and the society in which he lives and the 
betterment of the condition of the criminal as an individual. It will attempt to 
crystallize all available thoughts on the subject. It will, therefore, publish original 
articles dealing with all phases of antisocial and criminal behavior. It will attempt 
to correlate these with other psychiatric and extra-psychiatric disciplines such as 
sociology, criminology, anthropology, biology and medicine as they appear to relate 
to the problem of antisocial behavior. 

The first number is scheduled to appear in January, 1955. Inquiries and manuscripts 
will be promptly acknowledged and answered. Further information may be 
obtained from the Editor, Ben Karpman, M. D., Station L, Washington 20, D. C. 





